File Formats Assessments
File formats are a means of structuring information in a sensible way for storage, retrieval and use. There is a wealth of different formats supporting a range of data types, from formats dedicated to a single type of data to container formats able to store several different types.
As discussed in their iPRES paper “Sustainability Assessments at the British Library: Formats, Frameworks and Findings”, the Digital Preservation Team at the British Library has undertaken file format assessments to capture knowledge about the gaps in current best practice, understanding and capability in working with specific file formats. The focus of each assessment is on capturing evidence-based preservation risks and the implications of institutional obsolescence which lead to problems maintaining the content over time.
The British Library’s assessments are being made available via this DPC wiki page in order to share their findings and facilitate engagement with the broader preservation community.
Feedback is always welcome. If you have any comments or suggestions, please email: DPT at the British Library
The British Library, The Library of Congress, Harvard Library, NARA and the Digital Preservation Coalition are beginning a new collaboration to coordinate and make available their file format assessments. This will grow the pool of assessments available, while avoiding duplication, increasing the quality, and minimising the effort of maintenance. As a first stage, these organisations are coordinating their next assessment work here.
- Library of Congress assessments
- Harvard University Library assessments
- National Archives and Records Administration assessments (coming soon)
Preservation Risk Assessments - Summaries
| eBook Summary
A broad overview of formats available within the eBook sector.
Preservation Risk Assessments by Format Type
| Tagged Image File Format
A widely-supported raster format for images.
| JPEG 2000
A compression standard and coding system for images, created by the Joint Photographic Experts Group.
| Portable Document Format
A file format optimised for the consistent display of text and embedded images, regardless of platform.
| EPUB Format
An open standard for electronic books (eBooks) and other content types published by the International Digital Publishing Forum (IDPF).
| Journal Article Tag Suite
An XML-based mark-up standard for e-Journal content, based on the earlier NLM Archiving and Interchange DTD.
| Open Document Text
A format for editable text documents, part of the OpenDocument Format family (ISO/IEC 26300) maintained by OASIS.
| Mobipocket Format
A proprietary standard for electronic book (eBook) content; used by Amazon as the basis of its AZW and KF8 formats.
| National Transfer Format
A vector format standard (BS 7567) developed in the 1980s for the transfer of geospatial information, now mostly obsolete.
| Geography Markup Language
An XML grammar for expressing geographical features, used as a modelling language for GIS and cartographic products.
| Waveform Audio File Format
An audio file format standard recommended by several professional bodies and memory institutions for the long-term preservation of audio files.
| FLAC (Free Lossless Audio Codec)
A non-proprietary, open-source, lossless audio file format.
| MP3 (MPEG Audio Layer III)
A widely available and supported but lossy audio file format.
|DIGITAL SHEET MUSIC FORMATS|
| Sibelius Format
A proprietary format for music notation, designed for use with Avid's Sibelius composing and music editing software.
| MusicXML Format
An XML-based exchange format for music notation, currently developed by the W3C Music Notation Community Group.
| Extensible Markup Language
A generic markup language for the encoding of text and data; specification maintained by the World Wide Web Consortium (W3C).