Jpylyzer business case template

From wiki.dpconline.org

Use this to help develop a convincing business case for the use of Jpylyzer in your organisation

About this business case template

This template presents generic business benefits, digital preservation risks and costs for applying Jpylyzer in a production environment, followed by an example of how these benefits might be tailored for, and presented in, a specific business case. Elements from this case study are mirrored in relevant parts of the Toolkit (for example the Benefits from this case study are mirrored on the DPBCT benefits template page). This template was developed by the SCAPE Project.

How to use this template

The sections on benefits, risks and costs can be reused by organisations who would like to create a business case for the use of Jpylyzer, but they must be tailored to that organisation's particular needs, aims and contextual situation. The Jpylyzer Business Case Example shows how the generic business benefits and risks can be adapted to meet the specific needs of a (theoretical) organisation. Developing benefits and risks requires careful analysis, adaptation, use of language and prioritisation as described elsewhere within the Digital Preservation Business Case Toolkit. The Step by step guide to building a business case is a good place to start. Before reading through this template any further, ensure you first have a good understanding of the function of Jpylyzer.

About Jpylyzer

Jpylyzer is a software tool designed to help identify JP2 files that are broken or invalid in some way, or that do not conform to a particular technical profile. As such it provides useful checking and analysis of acquired JP2 files, and plays a role in ensuring the quality and consistency of JP2 files in a digitisation workflow.

Jpylyzer benefits

Use this as a starting point for the benefits of using Jpylyzer at your organisation. There may be benefits specific to your organisation or circumstances that are not covered here, so brainstorm your own as well.

Jpylyzer benefit summary

This section provides a summary of generic business benefits for using Jpylyzer.

Direct benefits:

  • Mitigates key JP2 preservation risks
  • Increases the quality and ensures the consistency of the construction of JP2 files
  • Ensures created JP2 files comply with an organisation's (policy driven) JP2 profile
  • Reduces data management and processing costs by catching bad files early
  • Enables efficient quality assurance of JP2 files created by 3rd party digitising organisations

Indirect benefits:

  • Enables application of JP2 format (and powerful JPEG2000 compression) for storing digitised masters (by mitigating preservation risks as described above) and therefore:
    • Significantly reduces storage costs for digitised collections or frees up resource for additional digitisation and/or storage
    • Enhances remote access and the optimised delivery of images, providing a better experience for the user

Jpylyzer benefits by SCAPE dimensions of scalability

This section describes generic business benefits for using Jpylyzer in the context of the four SCAPE Project dimensions of scalability. It provides a different perspective on the Jpylyzer business benefits described above.

  • Number of objects
    • Quality checking huge numbers of objects manually requires considerable resources. Even manual checks of a small percentage of objects can be costly, and manual checking has been shown to be ineffective at identifying some examples of badly formed JP2s. Jpylyzer can be applied to automatically check every object passing through a digitisation or ingest workflow. It can identify badly formed JP2s as well as ensuring conformance to an organisation's JP2 profile.
  • Size of objects
    • Processing large objects, and indeed large numbers of large objects, increases the likelihood and impact of everyday IT issues on the resulting files. Network dropouts, disk errors or capacity issues, and software bugs can all lead to the creation of damaged files. As noted above, automated checking with Jpylyzer has advantages in cost, accuracy and coverage of quality checking. It is far more effective than manual checking at detecting broken or truncated files resulting from the issues mentioned.
  • Complexity of objects
    • The JPEG2000 standard offers a complex range of options for construction of a JP2 file. A range of compression types and levels are possible, and images can be optimised for remote delivery in a number of different ways. Consequently there are many things that can go wrong when creating a JP2. Conformance checking against an institutional profile using Jpylyzer will catch mistakes of this kind.
  • Heterogeneity of collections
    • JP2 was around for many years before its more recent adoption by memory institutions for storing digitised masters, and before a number of significant preservation issues in the JPEG2000 standard and in JP2-creating applications were fixed. Jpylyzer is therefore vital for identifying JP2s that exhibit these risks in deposited collections (i.e. data acquired by or deposited with a repository).
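
The automated checking described under "Number of objects" and "Size of objects" could be sketched as below. This is an illustration only, not how Jpylyzer validates files. The helper looks_intact is a hypothetical and deliberately crude stand-in for a real validator: it checks only that the file opens with the 12-byte JP2 signature box and, assuming the codestream is the last box in the file (common, but not required by the format), that it ends with the end-of-codestream marker FF D9. In a real workflow the is_valid argument would call Jpylyzer instead.

```python
import os
from pathlib import Path

# 12-byte JP2 signature box that opens every JP2 file.
JP2_SIGNATURE = b"\x00\x00\x00\x0cjP  \r\n\x87\n"
# End-of-codestream (EOC) marker that closes a JPEG 2000 codestream.
EOC_MARKER = b"\xff\xd9"


def looks_intact(path):
    """Crude stand-in check (hypothetical helper, NOT Jpylyzer's method).

    Assumes the codestream box is the last box in the file.
    """
    with open(path, "rb") as handle:
        head = handle.read(len(JP2_SIGNATURE))
        size = handle.seek(0, os.SEEK_END)
        # Read only the final bytes, so large masters are not loaded whole.
        handle.seek(max(0, size - len(EOC_MARKER)))
        tail = handle.read()
    return head == JP2_SIGNATURE and tail == EOC_MARKER


def triage(folder, is_valid=looks_intact):
    """Check every .jp2 file under folder; return (accepted, rejected)."""
    accepted, rejected = [], []
    for path in sorted(Path(folder).rglob("*.jp2")):
        (accepted if is_valid(path) else rejected).append(path)
    return accepted, rejected
```

Passing the validator in as a function keeps the triage step independent of the checking tool, so the same loop can cover every object in a batch whichever check is plugged in.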

For more on developing and articulating your business benefits see the DPBCT sections on Benefits, Stakeholder analysis, Who is going to be affected? and How do I make the case for what I want to do?.

Jpylyzer cost elements

Use this to identify issues that relate to specific costs in your business case.

Jpylyzer specific:

  • Installation, testing and maintenance are likely to be straightforward:
    • Easy installation via Windows executable or Debian package
    • Jpylyzer can be used as a simple command line tool (or a Python module), making it easy to build into a preservation workflow
    • Jpylyzer has been incorporated in workflow tools such as Goobi
  • Jpylyzer is well documented
  • Some knowledge of the intricacies of the JPEG2000 standard (which is complex) is necessary to interpret Jpylyzer output and take sensible action based on it
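
Because Jpylyzer writes its report as XML, building the command line tool into a workflow can be as small as the sketch below. It assumes the jpylyzer executable is on the PATH and that the report contains an isValid element (isValidJP2 in older 1.x releases) holding True or False. Both assumptions should be checked against the documentation for the installed version. The function names are illustrative and are not part of Jpylyzer.

```python
import subprocess
import xml.etree.ElementTree as ET


def validity_from_report(xml_text):
    """Return True/False from a Jpylyzer XML report, or None if undetermined."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    for elem in root.iter():
        # Drop any XML namespace prefix, since it may differ between releases.
        name = elem.tag.rsplit("}", 1)[-1]
        # Assumed element names; verify against the installed version.
        if name in ("isValid", "isValidJP2"):
            return (elem.text or "").strip() == "True"
    return None


def validate_with_jpylyzer(path, executable="jpylyzer"):
    """Run the Jpylyzer command line tool on one file and read its verdict."""
    result = subprocess.run(
        [executable, str(path)], capture_output=True, text=True, check=False
    )
    return validity_from_report(result.stdout)
```

Returning None when no verdict can be read keeps "could not be checked" distinct from "checked and invalid", which matters when deciding what action to take on the output.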

Moving from TIFF to JP2 as a digitised master format (using Jpylyzer as a validator for created JP2s):

  • Moving to JP2 requires considerable expertise with the JPEG2000 standard
    • Developing an appropriate profile requires in-depth JPEG2000 understanding
    • Deciding on target file sizes and levels of compression requires detailed consideration and assessment, both from a technical perspective and from a curatorial perspective (what level of image loss is tolerable if lossy compression is used?)
  • Evaluation of suitable JP2 creation software will need to be performed
  • Delivering content in JP2 format requires server side delivery capabilities

For more on identifying and understanding the costs of your business activity see the DPBCT sections on Costs, Institutional readiness and What resources are we focussing on?.

Jpylyzer digital preservation risks

Use this to understand the key digital preservation risks relating to JP2 files that Jpylyzer can identify and help to mitigate.

Jpylyzer provides a mechanism to identify a number of preservation risks relating to badly constructed JP2 files. Research and experimentation have indicated that issues with the JPEG2000 standard, issues with JP2-creating software and other miscellaneous IT problems can lead to the creation of JP2 files that exhibit these risks. These include missing or incorrect resolution metadata, missing or incorrect colour profiles, issues with format identification, and various forms of byte corruption that can be difficult to catch with manual checking. Note that applying Jpylyzer will not mitigate all preservation risks, such as concerns about the lack of a performant open source decoder.

Jpylyzer also provides a mechanism to check that a JP2 file conforms to an institutional JP2 profile that defines such details as compression type and level, and the nature of optimisation for remote delivery. This is particularly useful when an organisation is creating JP2 files, typically as digitised masters. While not considered a direct preservation risk, files that do not conform to an institutional profile could cause preservation issues, and trigger costly management activities, at a later date. Where files subsequently need to be fixed or altered, preservation risk can increase considerably.
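
A profile check of this kind can be sketched as a comparison between the properties reported for a file and the values the institutional profile requires. The element names (transformation, layers, levels), the profile values and the function name below are assumptions made for illustration; take the real property names from the Jpylyzer documentation for the installed version. One common alternative is to apply a rule language such as Schematron to the XML report.

```python
import xml.etree.ElementTree as ET

# Hypothetical institutional profile: property name -> required value.
PROFILE = {"transformation": "9-7 irreversible", "layers": "8", "levels": "5"}


def profile_deviations(xml_text, profile=PROFILE):
    """Return {property: (required, reported)} for every mismatch.

    A property missing from the report is listed with reported = None.
    """
    reported = {}
    for elem in ET.fromstring(xml_text).iter():
        name = elem.tag.rsplit("}", 1)[-1]  # ignore any namespace prefix
        if name in profile and name not in reported:
            reported[name] = (elem.text or "").strip()
    return {
        name: (required, reported.get(name))
        for name, required in profile.items()
        if reported.get(name) != required
    }
```

An empty result means the file matches the profile. Anything else lists exactly which policy choices were not met, which supports deciding whether a file should be rejected or regenerated.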

For more on identifying and understanding the digital preservation risks that your business activity is targeting see the DPBCT sections on Digital preservation risks, Understand your collection and Why are we writing a business case?.

For more on developing preservation policies see the SCAPE Project Catalogue of Preservation Policy Elements.

Jpylyzer implementation risks

Use this to identify issues that relate to specific implementation risks in your business case.

This section provides a list of Jpylyzer specific issues for consideration when developing an assessment of implementation risks for a Jpylyzer related business activity. Many risks will depend on project or organisation specific factors (such as staffing and expertise) which cannot be addressed in this template. However, these issues may be useful in identifying risks that should be considered.

  • Jpylyzer is comprehensive but does not parse/validate every aspect of a JP2
  • Jpylyzer has been developed primarily by a single developer, but is now supported by OPF and has seen production use
  • Application of Jpylyzer does not mitigate all identified JP2 related preservation risks. Choosing JP2 for storing digitised masters is a trade-off between cost, risk and quality
  • Jpylyzer does not provide across the board quality assurance for a JP2 workflow, but does fulfil an important role within such a workflow. Jpylyzer should be operated in conjunction with sampled manual checks and other automated checks (such as image comparison tools like Matchbox)

Jpylyzer in use

The Wellcome Library was an early adopter of Jpylyzer. Dave Thompson, Digital Curator, describes how The Wellcome uses Jpylyzer and what value they get from it.

"We use it in two ways; firstly to automatically validate JP2 images created for us by external agencies and secondly to validate JP2 images we have created ourselves. Building the Jpylyzer into our workflow as an automated validation tool allows us to validate the technical properties of every single JP2 image that we process. A level of validation that we’d never be able to achieve if it were performed by a human.

We take JP2 images from a variety of sources, external and internal, and we need to be sure that the images we are ingesting into our digital object repository are consistent and uniform. Using the Jpylyzer is part of our ambition for high levels of automation that allow us to achieve high levels of throughput whilst maintaining a level of consistency in our images. This will be important later on when we need to perform preservation actions on our JP2 images, as consistency of the images will reduce the complexity of the actions we need to take."

For more on the relevance and value of understanding the context to your organisation's business case, see the DPBCT section on External context.

Jpylyzer business case example

This example business case applies the Jpylyzer benefits and risks (see above) to a particular (theoretical) organisational situation. It shows how they could be tailored to the needs of an organisation and the likely concerns and interests of stakeholders. It comprises key sections from the DPBCT Template for building a business case followed by explanatory discussion notes.

Jpylyzer executive summary

An example executive summary.

Mass digitisation projects generate millions of master images that must be stored in multiple locations to ensure their longevity. At this scale, storage costs (even in the short term) are considerable. JPEG2000 technology offers the potential to significantly reduce the size of digitised masters for a negligible loss of quality. Using the JP2 file format to store digitised masters therefore provides an attractive alternative to the conventional choice of TIFF. There are however considerable digital preservation concerns about JP2, which could put the longevity of digitised collections at risk. This business activity will put in place a quality assurance process that will validate JP2 masters, mitigate preservation risks associated with their usage, and as a result enable considerable storage cost savings. By reducing initial storage costs by around 60%, resources will be freed up for the digitisation of one million additional pages.

Discussion

Discussion notes explaining the approach in developing the Executive Summary example, above.

The summary above explains the context and current situation before describing the change to business processes that will be implemented. It focuses on the key elements for a business case of this kind: benefits, costs and risks. The text uses some technical language (mention of JP2 and TIFF) and this may be deemed too technical for the audience, in which case it may be better to refer to the technologies in general terms without specifically naming them.

For more on summarising your business case and delivering your key messages succinctly see the DPBCT sections on Executive summary and How do I make the case for what I want to do?.

Jpylyzer business activity

An example Business Activity description.

The existing digitisation workflow for mass digitisation projects at this organisation migrates camera raw images from the digitisation studio to TIFF files for storage as master images in the long term digital repository. The proposed business activity will instead migrate camera raw images to JP2, and then validate the construction of these files using Jpylyzer. The main activities are:

  • Develop appropriate JP2 profile and compression level targets
  • Assess, select and test JP2 migration software
  • Implement and test workflow to migrate and validate JP2s
  • Large scale trial run

Discussion

Discussion notes explaining the approach in developing the Business Activity example, above.

Moving from TIFF to JP2 for the storage of digitised masters will not suit every collection or organisation. Benefits can include reduced storage cost (particularly if lossy compression is used) and improved user experience (if images are optimised and delivered to the browser by an appropriate system). However, these benefits introduce preservation risks (some of which are not mitigated by the use of Jpylyzer in an assessment workflow), and lossy compression lowers the quality of the stored images. The trade-off between cost and risk will differ from one situation to another. Making an appropriate choice for the collection and organisation in question is essential, and should be guided by collection needs, institutional policy, a careful evaluation of the alternatives, practical testing of software with real data, and a clear appreciation of the risks involved.

For more on developing and articulating your business activity see the DPBCT sections on Business activity, How do I make the case for what I want to do? and Why are we writing a business case?.

Jpylyzer benefits

An example of business benefits.

Implementation of a quality assurance process for digitised masters, based on the application of Jpylyzer, will facilitate a move from TIFF to JP2 for the storage of digitised masters from mass digitisation (1 million+ pages) projects. It will generate the following benefits:

  • A reduction in storage costs of around 60%, enabling a further 1 million pages to be digitised
    • Supporting key organisational objective 3: "Significantly increase access to the collections by implementing digitisation of at least 5 million pages of our collection"
  • Mitigation of key long term digital preservation risks
    • Supporting key organisational objective 2: "Preserve and provide access to our collections for future generations of researchers"
  • Improved quality of digitisation results and improved efficiency of their generation via the use of cutting edge quality assurance technology
    • Supporting key organisational objective 5: "Deliver organisational change, through the use of new techniques and technology, enhancing our reputation by doing more with less"

Discussion

Discussion notes explaining the approach in developing the Business Benefits example, above.

This example benefits section distils a range of possible benefits down to three really important issues for this organisation, as shown by close alignment with the organisational objectives: reduce costs/do more, ensure preservation, and improve quality/efficiency. The benefits in this case are pitched in the context of the wider operation of moving from the generation of TIFF to JP2 masters, rather than just the specifics of the preservation issues. Benefits tied primarily to long term preservation will often also be linked to quality and cost. Improved quality tends to result from better managed and validated processes that follow a sensible policy. Depending on the stakeholder(s) in question this might be viewed as more significant and immediate than preservation benefits, and hence be considered more favourably.

A reduction in costs could simply save money from the storage and preservation budget, or it could expressly be used to enable more digitisation (more realistic in practice if digitisation, preservation and initial storage are all initially covered by the same capital investment). This could be crucial in making a strong case to the most important stakeholders. In this example, freeing up resources to enable extra digitisation aligns well with a crucial organisational objective.
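
The link between a storage saving and extra digitisation can be made concrete with simple arithmetic. The sketch below is illustrative only: the function name and every figure in the usage note are hypothetical, and the model assumes a fixed storage cost per page, a fixed digitisation cost per page, and that each extra page must itself be stored as a JP2.

```python
def extra_pages_funded(pages, storage_cost_per_page, reduction,
                       digitisation_cost_per_page):
    """Estimate how many extra pages the storage saving could pay for.

    reduction is the fractional storage saving (0.6 means 60% smaller).
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    saving = pages * storage_cost_per_page * reduction
    # Each extra page costs its digitisation plus its (reduced) storage.
    cost_per_extra_page = (
        digitisation_cost_per_page + storage_cost_per_page * (1 - reduction)
    )
    if cost_per_extra_page <= 0:
        raise ValueError("cost per extra page must be positive")
    return int(saving // cost_per_extra_page)
```

For instance, with 5 million planned pages, a hypothetical storage cost of 0.10 per page, a 60% reduction and a hypothetical digitisation cost of 0.24 per page, the saving of 300,000 would fund roughly one million extra pages. Real figures will differ, and a stakeholder is likely to ask how they were derived.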

For more on developing and articulating your business benefits see the DPBCT sections on Benefits, Stakeholder analysis and Who is going to be affected?.

Jpylyzer digital preservation risks

An example of digital preservation risks relevant to a specific business case.

A number of digital preservation risks associated with the JP2 format were identified by early adopters of this technology. Without the application of suitable mitigation there is considerable concern for the longevity of material stored in this format. Software applications that generate JP2s have shown some degree of unreliability. Processing large volumes of data can push generating software (and other workflow processes) to their limits. The results can be invalid or badly formed JP2 files, or even truncated files. The JP2 format presents a vast array of options for the design of a particular JP2 file. JP2s can be optimised for delivery and access in a variety of ways, and the type and level of lossy compression is crucial in ensuring appropriate file size and quality levels. These policy choices are defined in the organisational JP2 profile. If JP2s are generated that do not meet this profile, management of and access to the files may become problematic.

This image, courtesy of the British Library, provides an example of an arbitrarily truncated JP2 created by a faulty workflow process at the British Library. This example was one of many used to test developments in Jpylyzer.

Discussion

Discussion notes explaining the approach in developing the Digital Preservation Risks example, above.

  • A strong business case results from the use of references to real examples of relevant preservation risks.
  • The use of images can be a powerful communication tool.
  • There should be a clear link from the Business activity to the risks outlined in this section. In this case the focus is on JP2, the format that Jpylyzer (which forms a central part of the business activity) validates.

For more on identifying and understanding the digital preservation risks that your business activity is targeting see the DPBCT sections on Digital preservation risks, Understand your collection and Why are we writing a business case?.

Acknowledgements

This business case template was created by the SCAPE Project with the support of the European Union under FP7 ICT-2009.4.1
