As digital collections grow in volume and sophistication, the tools used to manage and sometimes automate their preservation must become more sophisticated as well. But because preservation actions are often idiosyncratic to the institutions in which they take place, a frustrating gap often remains between the digital preservation toolkit and its deployment in varying circumstances.
Addressing this gap was the impetus for a recent “mashup” digital preservation hack-a-thon event held in the United Kingdom. Dr. Cal Lee and Dr. Kam Woods from the School of Information and Library Science (SILS) were among those who attended in the hope of uniting developers and digital collections stakeholders to come up with innovative solutions for digital preservation.
“Practical Tools for Digital Preservation: a Hack-a-thon” was held September 27-29 at the York Science Park on the campus of the University of York in northeastern England. The idea of the event was to match up, or “mashup,” professionals facing practical preservation problems in their collections with developers who would try to help them find ways to approach those challenges.
Cal Lee was the collection owner in a pairing with Swithun Crowe, Developer for Arts and Humanities Computer Projects at the University of St. Andrews in Scotland. According to a blog post Crowe wrote for the Open Planets Foundation about the event, Lee’s collection consisted of realistic disk images: images of Windows PC hard drives that had been used by students who, for several weeks, pretended to work for a fictitious company and in the process generated documents, application settings, and user information on the disks.
In the course of the two-day hack-a-thon, Crowe developed a partial solution for Lee’s disk image preservation issue. Crowe made a web interface that would allow users to choose a disk image and then navigate through the directories in the file system and see their contents. Though he did not implement a complete solution in the short time allotted, Crowe writes that his work with Lee has “helped to refine the questions that need answering.”
Kam Woods, a Postdoctoral Research Associate at SILS, took the development role in a mashup with Jenny Mitcham, a curatorial officer from the Archaeology Data Service, which is headquartered at the University of York. Mitcham’s issue was identifying images embedded in Microsoft Office (.docx) files. Over the two days of the hack-a-thon, Woods developed docXtractor, an application that pulls the images and associated metadata out of .docx files and stores them in a separate external directory for each document. The application also preserves the contextual associations between the original files and the images as they were originally embedded in the documents.
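How docXtractor works internally is not described in this article. As a rough illustration of the general technique, a .docx file is a ZIP container, and Word normally stores embedded pictures under the internal `word/media/` folder. The minimal Python sketch below, which is an assumption-laden illustration and not Woods’ code, copies those files into one directory per document. The function name `extract_docx_images` and the output layout are hypothetical, and the sketch does not handle the associated metadata.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_docx_images(docx_path, out_root):
    """Copy embedded media from one .docx into out_root/<document name>/.

    Hypothetical helper for illustration; not part of docXtractor.
    Returns the list of files written.
    """
    docx_path = Path(docx_path)
    # One directory per source document keeps each image tied to its origin.
    target_dir = Path(out_root) / docx_path.stem
    extracted = []
    with zipfile.ZipFile(docx_path) as archive:
        for info in archive.infolist():
            name = PurePosixPath(info.filename)
            # Skip anything that is not a file under word/media/.
            if info.is_dir() or name.parts[:2] != ("word", "media"):
                continue
            target_dir.mkdir(parents=True, exist_ok=True)
            destination = target_dir / name.name
            destination.write_bytes(archive.read(info))
            extracted.append(destination)
    return extracted
```

Calling `extract_docx_images("report.docx", "extracted")` would, under these assumptions, write any embedded pictures to `extracted/report/`. Recording where each image appeared within the document would additionally require reading the document’s relationship files, which this sketch leaves out.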
At the end of the hack-a-thon, the participants voted Woods’ application the best solution of the event, a proud distinction given that the participants are at the forefront of digital preservation in the UK. “Practical Tools for Digital Preservation” was co-sponsored by the Open Planets Foundation (OPF) and the Digital Preservation Coalition (DPC). OPF is headquartered in London and was established to provide practical solutions in digital preservation based on the results of the Planets project, a European Union project aimed at “ensuring long-term access to digital cultural and scientific assets.” DPC, which hosted the hack-a-thon at its offices in the York Science Park, was established “to foster joint action to address the urgent challenges of securing the preservation of digital resources in the UK.”
Photo credit for the top photo: Open Planets Foundation Labs. Kam Woods is standing on the far left in the back row.