File Formats Assessments: Difference between revisions

From wiki.dpconline.org
(30 intermediate revisions by 3 users not shown)
Line 4: Line 4:
|File formats are a means of structuring information in a sensible way for storage, retrieval and use. There is a wealth of different formats supporting a range of data types, from formats dedicated to a single data type to container formats able to store several types of data.


As discussed in their iPRES paper “[https://fedora.phaidra.univie.ac.at/fedora/get/o:378110/bdef:Content/get Sustainability Assessments at the British Library: Formats, Frameworks and Findings]”, '''the Digital Preservation Team at the British Library''' has undertaken file format assessments to capture knowledge about the gaps in current best practice, understanding and capability in working with specific file formats. The focus of each assessment is on capturing evidence-based preservation risks and the implications of institutional obsolescence which lead to problems maintaining the content over time.
As discussed in their iPRES paper “[https://fedora.phaidra.univie.ac.at/fedora/get/o:378110/bdef:Content/get Sustainability Assessments at the British Library: Formats, Frameworks and Findings]”, the [https://www.bl.uk/digital-preservation Digital Preservation Team at the British Library] has undertaken file format assessments to capture knowledge about the gaps in current best practice, understanding and capability in working with specific file formats. The focus of each assessment is on capturing evidence-based preservation risks and the implications of institutional obsolescence which lead to problems maintaining the content over time.


The British Library’s assessments are being made available via this DPC wiki page in order to share their findings and facilitate engagement with the broader preservation community.
Line 29: Line 29:
=== Preservation Risk Assessments by Format Type ===
----
{|
{| border="0"
!colspan="3" style="text-align: left"|IMAGE FORMATS
|- style="vertical-align:top;"
Line 45: Line 45:
!colspan="3" style="text-align: left"|DOCUMENT FORMATS
|- style="vertical-align:top;"
|[[File:Icon-PDF.png|80px|link={{filepath:PDF_Assessment_v1.3.pdf}}]]
|[[File:Icon-PDF.png|80px|link={{filepath:PDF_Assessment_v1.5.pdf}}]]
|[[Media:PDF_Assessment_v1.3.pdf | '''Portable Document Format''']]
|[[Media:PDF_Assessment_v1.5.pdf | '''Portable Document Format''']]
A file format optimised for the consistent display of text and embedded images, regardless of platform.
<h5 style="color:red;">UPDATED! </h5>
|[[File:WhiteBorder100.jpg|10px]]
|[[File:Icon-PDFA_small2.png|80px|link={{filepath:PDFA_Assessment_v1.0.pdf}}]]
|[[Media:PDFA_Assessment_v1.0.pdf | '''PDF/A''']]
A sub-format of PDF for long-term archiving of electronic documents.
<h5 style="color:red;">NEW! </h5>
|[[File:WhiteBorder100.jpg|10px]]
|[[File:Icon-EPUB.png|80px|link={{filepath:EPUB_Assessment_v1.2.pdf}}]]
|[[File:Icon-EPUB.png|80px|link={{filepath:EPUB_Assessment_v1.4a.pdf}}]]
|[[Media:EPUB_Assessment_v1.2.pdf | '''EPUB''']]
|[[Media:EPUB_Assessment_v1.4a.pdf | '''EPUB''']]
An open standard for electronic books (eBooks) and other content types published by the International Digital Publishing Forum (IDPF).
An open standard for electronic books (eBooks) and other content types published by the International Digital Publishing Forum (IDPF). <h5 style="color:red;">UPDATED! </h5>
|[[File:WhiteBorder100.jpg|10px]]
|-
|&nbsp;
|-
!colspan="3" style="text-align: left"|'''
|- style="vertical-align:top;"
|[[File:Icon-JATS.png|80px|link={{filepath:JATS NLM Assessment v1.3.pdf}}]]
|[[Media:JATS NLM Assessment v1.3.pdf | '''Journal Article Tag Suite''']]
Line 60: Line 71:
|[[Media:ODT Assessment-v1.pdf | '''Open Document Text''']]
A format for editable textual documents; part of the ISO 26300 OpenDocument Format family maintained by OASIS.
<span style="color:white"> Coming sooooooooooooooon!</span>
|[[File:WhiteBorder100.jpg|10px]]
|[[File:Icon-MOBI.png|80px|link={{filepath:Mobipocket_Assessment_v1.1a.pdf}}]]
|[[Media:Mobipocket_Assessment_v1.1a.pdf | '''Mobipocket Format''']]
A proprietary standard for electronic book (eBook) content; used by Amazon as the basis of its AZW and KF8 formats. <h5 style="color:red;">UPDATED! </h5>
|[[File:WhiteBorder100.jpg|10px]]
|-
|&nbsp;
Line 70: Line 85:
A vector format standard (BS 7567) developed in the 1980s for the transfer of geospatial information, now mostly obsolete.
|[[File:WhiteBorder100.jpg|10px]]
|[[File:Icon-GML.png|80px|link=]]
|'''Geography Markup Language'''
<h5 style="color:red;text-align:center;">NEW! </h5>
|-
|&nbsp;
Line 85: Line 97:
|[[Media: FLAC_Assessment_v1.0.pdf | '''FLAC (Free Lossless Audio Codec)''']]
A non-proprietary open source lossless audio file format.
<h5 style="color:red;text-align:center;">NEW! </h5>
|[[File:WhiteBorder100.jpg|10px]]
|[[File:Icon-MP3.png|80px|link={{filepath:MP3_Assessment_v1.0.pdf}}]]
|[[Media: MP3_Assessment_v1.0.pdf | '''MP3 (MPEG Audio Layer III)''']]
A widely available and supported but lossy audio file format.
<h5 style="color:red;text-align:center;">NEW! </h5>
|[[File:WhiteBorder100.jpg|10px]]
|-
|&nbsp;
Line 96: Line 107:
!colspan="3" style="text-align: left"|DIGITAL SHEET MUSIC FORMATS
|- style="vertical-align:top;"
|[[File:Icon-SIB.png|80px|link={{filepath:Sibelius_Assessment_v1.0.pdf}}]]
|[[File:Icon-SIB.png|80px|link={{filepath:Sibelius_Assessment_v1.15.pdf}}]]
|[[Media:Sibelius_Assessment_v1.0.pdf | '''Sibelius Format''']]
|[[Media:Sibelius_Assessment_v1.15.pdf | '''Sibelius Format''']]
A proprietary format for music notation designed to be used with Avid Software's Sibelius composing and music editing software.
A proprietary format for music notation designed to be used with Avid Software's Sibelius composing and music editing software. <h5 style="color:red;">UPDATED! </h5>
<h5 style="color:red;text-align:center;">NEW! </h5>|-
|[[File:WhiteBorder100.jpg|10px]]
|[[File:Icon-MusicXML.png|80px|link={{filepath:MusicXML_Format_Assessment_v1.15.pdf}}]]
|[[Media:MusicXML_Format_Assessment_v1.15.pdf | '''MusicXML Format''']]
An XML-based exchange format for music notation, currently developed by the W3C Music Notation Community Group. <h5 style="color:red;">UPDATED! </h5>
|[[File:WhiteBorder100.jpg|10px]]
|-
|&nbsp;
|-
!colspan="3" style="text-align: left"|GENERIC FORMATS
Line 106: Line 123:
|[[Media:XML_Assessment_v1.3.pdf | '''Extensible Markup Language''']]
A generic markup language for the encoding of text and data; specification maintained by the World Wide Web Consortium (W3C).
|[[File:WhiteBorder100.jpg|10px]]
|-

=== Assessment Criteria ===
See the '''[[File Format Assessment Factors | format assessment factors]]''' covered in each assessment.

Revision as of 15:33, 26 February 2020

File formats are a means of structuring information in a sensible way for storage, retrieval and use. There is a wealth of different formats supporting a range of data types, from formats dedicated to a single data type to container formats able to store several types of data.

As discussed in their iPRES paper “Sustainability Assessments at the British Library: Formats, Frameworks and Findings”, the Digital Preservation Team at the British Library has undertaken file format assessments to capture knowledge about the gaps in current best practice, understanding and capability in working with specific file formats. The focus of each assessment is on capturing evidence-based preservation risks and the implications of institutional obsolescence which lead to problems maintaining the content over time.

The British Library’s assessments are being made available via this DPC wiki page in order to share their findings and facilitate engagement with the broader preservation community.

Feedback is always welcome. If you have any comments or suggestions, please email: DPT at the British Library


Collaboration

The British Library, The Library of Congress, Harvard Library, NARA and the Digital Preservation Coalition are beginning a new collaboration to coordinate and make available their file format assessments. This will grow the pool of assessments available, while avoiding duplication, increasing the quality, and minimising the effort of maintenance. As a first stage, these organisations are coordinating their next assessment work here.

Preservation Risk Assessments - Summaries


eBook Summary

A broad overview of formats available within the eBook sector.

Preservation Risk Assessments by Format Type


Assessment Criteria

See the format assessment factors covered in each assessment.

IMAGE FORMATS
Tagged Image File Format

A widely-supported raster format for images.

JPEG 2000

A compression standard and coding system for images, created by the Joint Photographic Experts Group.

DOCUMENT FORMATS
Portable Document Format

A file format optimised for the consistent display of text and embedded images, regardless of platform.

UPDATED!

PDF/A

A sub-format of PDF for long-term archiving of electronic documents.

NEW!

EPUB

An open standard for electronic books (eBooks) and other content types published by the International Digital Publishing Forum (IDPF).

UPDATED!

Journal Article Tag Suite

An XML-based mark-up standard for e-Journal content, based on the earlier NLM Archiving and Interchange DTD.

Open Document Text

A format for editable textual documents; part of the ISO 26300 OpenDocument Format family maintained by OASIS.

Mobipocket Format

A proprietary standard for electronic book (eBook) content; used by Amazon as the basis of its AZW and KF8 formats.

UPDATED!

GEOSPATIAL FORMATS
National Transfer Format

A vector format standard (BS 7567) developed in the 1980s for the transfer of geospatial information, now mostly obsolete.

AUDIOVISUAL FORMATS
Waveform Audio File Format

An audio file format standard recommended by several professional bodies and memory institutions for the long-term preservation of audio files.

FLAC (Free Lossless Audio Codec)

A non-proprietary open source lossless audio file format.

MP3 (MPEG Audio Layer III)

A widely available and supported but lossy audio file format.

DIGITAL SHEET MUSIC FORMATS
Sibelius Format

A proprietary format for music notation designed to be used with Avid Software's Sibelius composing and music editing software.

UPDATED!

MusicXML Format

An XML-based exchange format for music notation, currently developed by the W3C Music Notation Community Group.

UPDATED!

GENERIC FORMATS
Extensible Markup Language

A generic markup language for the encoding of text and data; specification maintained by the World Wide Web Consortium (W3C).