File Formats Assessments

Revision as of 09:00, 16 November 2023 by Swhibley (talk | contribs)
File formats are a means of structuring information in a sensible way for storage, retrieval and use. There is a wealth of different formats supporting a range of data types, from formats designed for a single type of content to container formats able to store several different types of data.

As discussed in their iPRES paper “Sustainability Assessments at the British Library: Formats, Frameworks and Findings”, the Digital Preservation Team at the British Library has undertaken file format assessments to capture knowledge about the gaps in current best practice, understanding and capability in working with specific file formats. The focus of each assessment is on capturing evidence-based preservation risks and the implications of institutional obsolescence which lead to problems maintaining the content over time.

The British Library’s assessments are being made available via this DPC wiki page in order to share their findings and facilitate engagement with the broader preservation community.

Feedback is always welcome. If you have any comments or suggestions, please email: DPT at the British Library

The British Library, the Library of Congress, Harvard Library, NARA and the Digital Preservation Coalition are beginning a new collaboration to coordinate and make available their file format assessments. This will grow the pool of available assessments while avoiding duplication, improving quality and minimising maintenance effort. As a first stage, these organisations are coordinating their next assessment work here.

Preservation Risk Assessments - Summaries

eBook Summary

A broad overview of formats available within the eBook sector.

Preservation Risk Assessments by Format Type

Assessment Criteria

See the format assessment factors covered in each assessment.

Tagged Image File Format

A widely-supported raster format for images.

JPEG 2000

A compression standard and coding system for images, created by the Joint Photographic Experts Group.

Portable Document Format

A file format optimised for the consistent display of text and embedded images, regardless of platform.

PDF/A

A sub-format of PDF for the long-term archiving of electronic documents.

EPUB

An open standard for electronic books (eBooks) and other content types published by the International Digital Publishing Forum (IDPF).

Journal Article Tag Suite

An XML-based mark-up standard for e-Journal content, based on the earlier NLM Archiving and Interchange DTD.

Open Document Text

A format for editable textual documents, part of the ISO 26300 OpenDocument Format family maintained by OASIS.

Mobipocket Format

A proprietary standard for electronic book (eBook) content; used by Amazon as the basis of its AZW and KF8 formats.

National Transfer Format

A vector format standard (BS 7567) developed in the 1980s for the transfer of geospatial information, now mostly obsolete.

Waveform Audio File Format

An audio file format standard recommended by several professional bodies and memory institutions for the long-term preservation of audio files.

FLAC (Free Lossless Audio Codec)

A non-proprietary, open-source, lossless audio file format.

MP3 (MPEG Audio Layer III)

A widely available and well-supported, but lossy, audio file format.

Sibelius Format

A proprietary format for music notation, designed for use with Avid's Sibelius composing and music editing software.

MusicXML Format

An XML-based exchange format for music notation, currently developed by the W3C Music Notation Community Group.

Extensible Markup Language

A generic markup language for the encoding of text and data; specification maintained by the World Wide Web Consortium (W3C).