From the project website:
- "Software underpins the creation and analysis of research data, some of this is standard commercial software or large scale open source projects; however much is individually produced and tweaked through the research process by those who are domain experts not necessarily computer scientists.
- This project is considering issues of software re-use and identification. It will start by considering the issues pertaining to persistent identification of software and how particular pieces of computational research software may not only be identified but kept in a runnable state"
- "Key project outputs:
- Report: Guidelines for persistently identifying software using DataCite
- Report: Phase II Report on Technical Progress and Case Study
- Landing page system and ‘Play it’ button demonstrations"
From a synthesis of the projects in the context of the OAIS model, by Jen Mitcham of the Filling the Digital Preservation Gap project:
- "A project with a key focus on access is “Software Reuse, Repurposing and Reproducibility” from the University of St Andrews. However, as is the case for many of these projects, it also touches on other areas of the model. At the end of the day, access isn't sustainable without preservation so the project team are also thinking more broadly about these issues.
- This project is looking at software that is created through research (the software that much research data actually depends on). What happens to software written by researchers, or created through projects when the person who was maintaining it leaves? How do people who want to reuse the data get hold of the right software? The project team have been looking at how you assign identifiers to software, how you capture software in such a way to make it usable in the future and how you then make that software accessible. Versioning is also a key concern in this area - different versions of software may need to be maintained with their own unique identifiers in order to allow future users of the data to replicate the results of a particular study. Issues around the preservation of and access to software are a bit of a hot topic in the digital preservation world so it is great to see an RDS project looking specifically at this."
Hyperlinks to further information on the project
- Phase 2 report, detailing work done as of December 2015
- Jisc Project Podcast
- Recompute on GitHub
Potential to enhance
Is there potential to leverage non-preservation focused developments to enhance preservation capabilities?
This project already aims to develop preservation capabilities: long-term preservation through assigning persistent identifiers, maintaining the relationship between content and software, and breaking down the components of ‘software’ to ensure a broad range of objects can be cited. To extend these capabilities, the researchers might aim to generalise the processes developed for assigning DataCite DOIs to bespoke research software, so that the same processes could be applied to other types of software, such as commercial software. These processes could benefit heritage and government sector organisations looking to perform similar versioning and PID assignment on their own software.
David Rosenthal, in his 2015 Andrew W. Mellon Foundation report, named "the tools for creating preserved system images are inadequate" as one of the two remaining barriers to deploying emulation as a preservation strategy. This work attempts to solve, and indeed automate, that challenge. Generating the "playable" image also has the useful side effect of ensuring that the source code can be built easily. In effect, this gives the Software Reuse project an impressive double-pronged approach to the long-term preservation of software.
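As an illustration of the versioning point above, the following is a minimal sketch of how two releases of a piece of research software might each carry their own identifier, with the later release pointing back to the earlier one. It is not taken from the project's guidelines. The field names are modelled on the DataCite metadata schema (resource type "Software", relation type "IsNewVersionOf"), and the DOIs, title, creator and publisher are hypothetical placeholders.

```python
import json


def software_version_record(doi, title, creators, publisher, year, version,
                            previous_doi=None):
    """Build a DataCite-style metadata record (a plain dict) for one release.

    Field names follow the DataCite metadata schema as an assumption;
    this is an illustrative sketch, not the project's actual workflow.
    """
    record = {
        "doi": doi,
        "titles": [{"title": title}],
        "creators": [{"name": name} for name in creators],
        "publisher": publisher,
        "publicationYear": year,
        "types": {"resourceTypeGeneral": "Software"},
        "version": version,
        "relatedIdentifiers": [],
    }
    if previous_doi is not None:
        # Link this release to the one it supersedes.
        record["relatedIdentifiers"].append({
            "relatedIdentifier": previous_doi,
            "relatedIdentifierType": "DOI",
            "relationType": "IsNewVersionOf",
        })
    return record


# Hypothetical DOIs and names, for illustration only.
v1 = software_version_record(
    "10.1234/example-tool.v1", "Example Analysis Tool",
    ["Researcher, A."], "Example University", 2015, "1.0",
)
v2 = software_version_record(
    "10.1234/example-tool.v2", "Example Analysis Tool",
    ["Researcher, A."], "Example University", 2016, "2.0",
    previous_doi=v1["doi"],
)

print(json.dumps(v2, indent=2))
```

Keeping one record per release means a dataset can cite the exact version of the software that produced it, while the relation between records lets a later user trace the history of the code.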
Is there potential for collaboration and/or exploiting existing/parallel work beyond the project consortiums?
The development of processes and methods for assigning PIDs to software could be aligned with the software preservation activities of the Open Preservation Foundation, which supports a broader range of sectors. The work could also be aligned with that of the Software Sustainability Institute. The range of contact points detailed in the Phase 2 report is very encouraging. Additions to the list could come from the access end of the process, the delivery of encapsulated images, in the form of the Olive Archive and bwFLA. The work of both is summarised well in the previously mentioned Rosenthal report (page 6 onwards).
Considerations going forward
What are the key considerations (with regard to preservation) for taking forward the work beyond the current phase?
The project has already developed strong approaches to preservation through documentation, metadata recommendations and the three main outputs including guidelines and case studies. Continuing to engage with the end users of these methods and processes will strengthen the outputs and also support sustainability. Collaborating with both researchers and the practitioners responsible for archiving this software in a repository will also help the project further develop preservation capabilities into the next phase.
Uptake and sustainability
What steps should be taken to ensure effective uptake and sustainability of the work within the digital preservation community?
Further dissemination of the outputs of this project, including simple, straightforward demonstrations at relevant events, will help establish the assignment of PIDs to research software in future. Broader applicability of core project developments offers the potential to open up new sources of funding from other sectors.
Project website sustainability checklist
A brief checklist ensuring the project work can be understood and reused by others in the future.
|Criterion|Score|
|---|---|
|Clear project summary on one page, hyperlink heavy|1|
|Project start/end dates|0|
|Clear licensing details for reuse|1|
|Clear contact details|1|
|Source code online and referenced from website|1|
2=present, 1=partial, 0=missing
- Potential: Wide applicability of the underlying developments could benefit the broad preservation sector (and beyond), and potentially open up further sources of funding.
- Would benefit from further dissemination in the broader digital preservation community.
- Suggest adding key details to the project website, including start/end dates, licensing details and references to the source code on GitHub.