BitCurator NLP to host day-long event focusing on natural language processing applications for primary sources

December 13, 2016

BitCurator NLP will host “nlp4arc - Enabling New Forms of Access to Primary Sources through Natural Language Processing” on February 3, 2017, from 9 a.m. to 5 p.m. at the Carolina Union, Room 3206, on the campus of UNC Chapel Hill.

The event will focus on the application of natural language processing (NLP) to support use, access, and analysis of digital primary source materials. Registration is available online.

The day will include a series of talks by internationally recognized experts, followed by a set of participant-driven unconference discussions. Speakers will include:

  • Mary Elings, University of California, Berkeley
  • Christopher (Cal) Lee, University of North Carolina at Chapel Hill
  • Mark Matienzo, Stanford University Libraries
  • Don Mennerich, New York University
  • Daniel Pitti, University of Virginia
  • Josh Schneider, Stanford University Libraries
  • Ryan Shaw, University of North Carolina at Chapel Hill
  • Stéfan Sinclair, McGill University
  • Brad Westbrook, retired
  • Carl Wilson, Open Preservation Foundation
  • Kam Woods, University of North Carolina at Chapel Hill

A rapidly growing body of materials with significant cultural value is “born digital.” Information professionals must be prepared to extract digital materials from their original environments and media in ways that reflect the rich metadata and ensure the integrity of the materials. They must also support new forms of access: allowing users to make sense of materials and understand their context.

Many types of contextual information can be vital to making sense of digital objects and using them meaningfully. These can include objects, agents, occurrences, purposes, times, places, forms of expression, concepts/abstractions and relationships.

Many existing open-source tools can help libraries, archives and museums (LAMs) identify, extract and expose such contextual entities from the wide diversity of born-digital materials that LAMs already hold and continue to receive. NLP tools and methods can help to both (1) facilitate curatorial decision making and description, and (2) generate access points to be presented to end users.

nlp4arc is being administered through BitCurator NLP, a project funded by the Andrew W. Mellon Foundation. Click here for a story about the new project, or visit