Digital Forensics with Dr. Christopher (Cal) Lee

September 1, 2011

Dr. Christopher (Cal) Lee is not a police officer, nor is he a lawyer or in the criminal justice field, but he is an information practitioner. He studies and helps others to understand digital forensics – defined by Rodney McKemmish as “the process of identifying, preserving, analysing and presenting digital evidence in a manner that is legally acceptable.” *

“While the application of forensics techniques inherited from law enforcement and computer security in a cultural heritage setting may seem odd or counter-intuitive, in fact, as Cal has repeatedly demonstrated through his pioneering work, the combination is not only natural, but also mutually enriching,” said Matthew G. Kirschenbaum, associate professor, Department of English and associate director, Maryland Institute for Technology in the Humanities at the University of Maryland. “Libraries, special collections, and other collecting institutions increasingly receive computer storage media (and sometimes entire computers) as part of their acquisition of ‘papers’ from contemporary artists, writers, musicians, government officials, politicians, scholars, scientists and other public figures. Smart phones, e-book readers, and other data-rich devices will surely follow. The same forensics software that indexes a criminal suspect’s hard drive allows the archivist to prepare a comprehensive manifest of the electronic files a donor has turned over for accession; the same hardware that allows the forensics investigator to create an algorithmically authenticated ‘image’ of a file system allows the archivist to ensure the integrity of digital content once captured from its source media; the same data-recovery procedures that allow the specialist to discover, recover, and present as trial evidence an ‘erased’ file may allow a scholar to reconstruct a lost or inadvertently deleted version of an electronic manuscript, and do so with enough confidence to stake reputation and career.”

Lee says that the traditional and predominant contexts of application in digital forensics are criminal investigations; however, these are not the contexts that drive his research and teaching. Instead, he works in a space that he’s calling “forensication”: the incorporation of digital forensics methods, tools and concepts in contexts other than criminal investigations. In particular, he is enabling information professionals to apply digital forensics principles, methods and tools to the acquisition, management, preservation and dissemination of archival materials such as government records and personal collections.

An entire industry has developed around lifting data off physical media, which is also a fundamental concern for institutions that are acquiring digital collections. Many companies now offer data recovery services. Professionalization of this work has included the establishment of specialized conferences and journals. There are also formal education programs, such as those offered by the National Center for Forensic Science (a computer forensic proficiency test since 2007 and a Master of Science in Digital Forensics since Spring 2008), along with undergraduate and graduate offerings at a variety of institutions.

“In 2002, the IETF issued ‘Guidelines for Evidence Collection and Archiving,’ which signaled attention to these issues among developers, engineers and administrators,” said Lee. “There are many notable overlaps with archives literature, including evidential value, chain of custody and documentation of acquisition activities. A landmark in the connection of these issues to digital curation was a 1999 report called ‘Digital Archaeology: Rescuing Neglected and Damaged Data Resources’ by Seamus Ross and Ann Gow. Most recently, a project funded by the Mellon Foundation called ‘Computer Forensics and Born-Digital Content in Cultural Heritage Collections’ hosted a symposium and generated a report, both of which have provided significant contributions to this discussion.” Lee served on the Advisory Board for the project, participated in the symposium and contributed to its final report.

Lee says that materials with significant cultural value are now predominantly “born digital,” and collecting institutions have great opportunities to acquire and preserve resources that document a wide array of human activities.

In order to seize these opportunities, information professionals must be prepared to extract digital materials from removable media in ways that reflect the rich metadata and ensure the integrity of the materials. They must also support and mediate appropriate access: allowing users to make sense of materials and understand their context, while also preventing inadvertent disclosure of sensitive data.
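As a rough illustration of what capturing metadata and ensuring integrity can look like in practice, the following Python sketch records a manifest of files copied from a piece of media and later checks that nothing has changed. It is a minimal sketch under stated assumptions, not a description of the tools Lee uses: it assumes the files have already been copied to, or mounted at, an ordinary directory, and the function names (sha256_of, build_manifest, verify_manifest) and the manifest fields are hypothetical choices made for this example. Dedicated forensic tools work on full disk images and capture far more than this.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """List every regular file under root with basic metadata and a checksum.

    The fields chosen here (path, size, modified, sha256) are illustrative
    assumptions, not an archival metadata standard.
    """
    root = Path(root)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        info = path.stat()
        entries.append({
            "path": path.relative_to(root).as_posix(),
            "size": info.st_size,
            "modified": info.st_mtime,  # seconds since the epoch
            "sha256": sha256_of(path),
        })
    return entries


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or whose checksum changed."""
    root = Path(root)
    problems = []
    for entry in manifest:
        target = root / entry["path"]
        if not target.is_file() or sha256_of(target) != entry["sha256"]:
            problems.append(entry["path"])
    return problems
```

A manifest built at the time of acquisition can be stored alongside the materials. Running verify_manifest against the same directory later returns an empty list if every file is still present and unchanged.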

He notes that many self-contained Linux-based packages already bundle tools to support digital forensics activities. However, their interfaces and documentation are unlikely to be approachable for library and archives professionals. Even more important, he says, are two fundamental needs of collecting institutions that software designed for the digital forensics industry does not address:

(1) Incorporation into the workflow of archives/library ingest and collection management, e.g. metadata conventions, hooks into existing collection management systems.
(2) Provision of public access to the data. The typical digital forensics scenario is a criminal investigation in which the public never gets access to the evidence that was seized. By contrast, collecting institutions that are creating disk images face issues of how to provide access to the data. This includes access interface issues, but also how to redact or restrict access to components of the image, based on confidentiality, intellectual property or other sensitivities.

Lee’s research has been attempting to meet these needs.

Lee, along with Kam Woods, postdoctoral fellow at SILS, is currently administering “Curation of a Forensic Data Collection for Education,” which is a sub-grant of a National Science Foundation-funded project led by Simson Garfinkel of the Naval Postgraduate School.

“We are developing and enhancing a collection of data that represents realistic scenarios (e.g. traces of computer use that span many years, multiple disk images that are relevant to the same matter, numerous end-user applications installed and used), while also being appropriate for students to use in support of digital forensics education,” said Lee. “Primary focus areas are annotation, scenarios, exercises, answer keys and other forms of data that can further enhance access and use of the disk images. We are also investigating strategies for ensuring that the data sets will remain available and useful beyond the life of the project.”

Lee, Garfinkel and Woods are investigating a variety of issues, system requirements and technical strategies for the management and long-term preservation of forensically-acquired data. They recently published a paper about this work called “Extending Digital Repository Architectures to Support Disk Image Preservation and Access” in the Proceedings of the Joint Conference on Digital Libraries.

In association with DigCCurr 2009, the digital curation curriculum symposium held April 1-3, 2009 in Chapel Hill, NC, Lee organized a symposium with Richard Szary (UNC at CH) and Tom Hyry (then at Yale University) called “Stewardship of E-Manuscripts: Advancing a Shared Agenda.” An invited set of leaders in this field – from Australia, Austria, the United Kingdom and the United States – shared strategies for and experience with the selection, acquisition, arrangement, description and preservation of, and access to, personal materials in digital form; participants included numerous leading experts on the application of digital forensics techniques to the acquisition of digital collections.

From June 2010 to June 2011, Lee led a project called the Digital Acquisition Learning Laboratory (DALL), which was funded by the Andrew W. Mellon Foundation. He has overseen the installation and setup of the digital forensics hardware and software to be used in support of both course work (offered to undergraduate and graduate students) at SILS and continuing professional education offerings. DALL has allowed Lee to participate in several intensive digital forensics courses. In the Spring 2011 semester, he administered an integrated set of exercises within a course (Electronic Records Management), which he team-taught with fellow SILS faculty member Richard Marciano. Lee and Woods also developed and administered a new special-topics course at SILS called “Acquiring Information from Digital Storage Media,” which introduced students to digital forensics concepts and methods, using both commercial and open-source software. In addition to building significant hardware, software and procedural capacity for future digital forensics teaching and research, the DALL project has provided SILS personnel with valuable practical lessons about the likely opportunities and challenges of information professionals using digital forensics tools.

*Source: McKemmish, R. “What is Forensic Computing?” Trends and Issues in Crime and Criminal Justice 118 (1999).

“Through a series of professional papers, grants, and workshops conducted at major international conferences, Cal Lee has provided both technical and conceptual guidance as archivists struggle to assimilate these powerful but alien and even foreboding tools and methodologies into their daily workflows,” said Kirschenbaum.

When asked where he thought this technology was headed, Lee said, “My vision is for widespread incorporation of the right bits and pieces of forensics methods into routine processing of acquisitions by collecting institutions.”

About Cal Lee
Christopher (Cal) Lee is an associate professor at the School of Information and Library Science at the University of North Carolina at Chapel Hill. He teaches courses on archival administration; records management; digital curation; understanding information technology for managing digital collections; and the construction of policies and rules for digital repositories. He also teaches half-day and full-day professional workshops on the application of digital forensics methods and principles to digital acquisitions. He is one of the lead organizers and instructors for the DigCCurr Professional Institute, which is a week-long continuing education workshop on digital curation that draws participants from around the world.

Lee’s primary area of research is the long-term curation of digital collections. He is particularly interested in the professionalization of this work and the diffusion of existing tools and methods (e.g. digital forensics, Web archiving, automated implementation of policies) into professional practice.

He recently edited, and contributed several chapters to, the book I, Digital: Personal Collections in the Digital Era, published by the Society of American Archivists. He has served as the principal investigator of the two projects discussed above: Digital Acquisition Learning Laboratory (DALL) and “Curation of a Forensic Data Collection for Education” (sub-grant of NSF Award DUE-0919593). He has also served as co-PI with Dr. Helen Tibbo, alumni distinguished professor at SILS, on several projects funded by the Institute of Museum and Library Services (IMLS): “Closing the Digital Curation Gap” (CDCG); “DigCCurr II: Extending an International Digital Curation Curriculum to Doctoral Students and Practitioners;” “Educating Stewards of Public Information in the 21st Century (ESOPI-21);” “Educating Stewards of the Public Information Infrastructure (ESOPI2);” and “Preserving Access to Our Digital Future: Building an International Digital Curation Curriculum & the Carolina Digital Curation Fellowship Program (DigCCurr).”