SILS Faculty and Doctoral Student Lead Data Curation Workshop

September 14, 2012

Three members of the SILS community recently took their expertise in digital curation on the road. Dr. Gary Marchionini, dean of SILS and Cary C. Boshamer Distinguished Professor; Dr. Cal Lee, associate professor; and Heather Bowden, doctoral student, conducted a workshop entitled “Curating for Quality: Ensuring Data Quality to Enable New Science,” in Arlington, Virginia. 

On September 10-11, Marchionini, Lee and Bowden led discussions about the increasing importance of curating research data. These data are the traces of collection, processing, transmission and use of scholarly work - for example, data that scholars collect, store in spreadsheets or databases and include in documents that report on their results.

Data curation - a type of digital curation - is, they argue, the best method to approach managing these rich new sources of data from a lifecycle perspective. As Dr. Lee and another SILS faculty member, Dr. Helen Tibbo, wrote in 2007, "Digital curation involves selection and appraisal by creators and archivists; evolving provision of intellectual access; redundant storage; data transformations; and, for some materials, a commitment to long-term preservation. Digital curation is stewardship that provides for the reproducibility and the re-use of authentic digital data and other digital assets."

At this workshop, Marchionini, Lee, and Bowden focused on defining data quality research issues and potential solutions. Over two days, they addressed topics including data quality criteria and contexts, human and institutional factors, tools for effective and painless curation, and metrics for data quality measurement and error correction. Participants included scholars and thought leaders from a variety of academic, industry and government contexts.

The workshop was supported by a grant from the National Science Foundation, for which Dr. Marchionini and Dr. Lee are co-principal investigators.

Participants included:

  •     Micah Altman, Massachusetts Institute of Technology (MIT)
  •     Kevin Ashley, Digital Curation Centre (DCC)
  •     Heather Bowden, University of North Carolina at Chapel Hill
  •     Jackie Bronicki, University of Michigan Library
  •     Ruth Duerr, National Snow and Ice Data Center (NSIDC)
  •     Ricky Erway, Online Computer Library Center (OCLC)
  •     Andrew Fiore, Facebook/UC Berkeley
  •     Michael Giarlo, Pennsylvania State University
  •     Alan Hall, National Climatic Data Center (NCDC)
  •     Leslie Johnston, Library of Congress
  •     Cal Lee, University of North Carolina at Chapel Hill
  •     Michael Lesk, Rutgers University
  •     Gary Marchionini, University of North Carolina at Chapel Hill
  •     Matt Mayernik, National Center for Atmospheric Research (NCAR)
  •     Jerome McDonough, University of Illinois at Urbana-Champaign
  •     Prasenjit Mitra, Pennsylvania State University
  •     Reagan Moore, University of North Carolina at Chapel Hill
  •     Michael Nelson, Old Dominion University
  •     Andy Rauber, Vienna University of Technology (TU Wien)
  •     Caitlin Sticco, National Library of Medicine (NLM)
  •     Jamie Taylor, Google
  •     Kris Tolle, Microsoft
  •     Doug White, National Institute of Standards and Technology (NIST)
  •     Jasmine Young, Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB)