September 28, 2011

The University of North Carolina at Chapel Hill is leading a new effort to address key data challenges facing scientific researchers in the digital age.
The National Science Foundation has awarded nearly $8 million over five years to the DataNet Federation Consortium, a group that spans seven universities, to build and deploy a prototype national data management infrastructure. About half the award will support research and development at UNC.

The consortium will address the data management needs of six science and engineering disciplines: oceanography, hydrology, engineering design, plant biology, cognitive science and social science.

The infrastructure project will support collaborative multidisciplinary research through shared collections and archives and data publication within digital libraries.

The Data Intensive Cyber Environments research group in UNC’s School of Information and Library Science leads the consortium. The Renaissance Computing Institute at UNC-Chapel Hill is responsible for federating the consortium’s diverse data repositories to enable cross-disciplinary research. Federating data involves tasks such as providing a common access interface and developing data management policies.

The DFC will use iRODS, the integrated Rule Oriented Data System, to implement a policy-based data management infrastructure. iRODS, developed by UNC’s DICE Center and DICE researchers at the University of California at San Diego, enforces policies as computer-actionable rules to organize distributed data into sharable collections. Procedures to automate data management functions are cast as computer-executable workflows. Policies control data access, sharing and archiving. Research groups worldwide, including the NASA Center for Climate Simulations, the National Optical Astronomy Observatory, the Australian Research Collaboration Service and the Texas Digital Libraries, use iRODS technology to manage their research data grids, implement digital libraries and build persistent archives.

“Excelling in the digital age requires that scientific disciplines and government agencies have the ability to manage the enormous amount of data that are generated each day,” said Barbara Entwisle, UNC’s vice chancellor for research. “Scientists can only solve the important problems of our times if they can easily access, share, analyze and preserve data for future researchers and students. This award is important beyond its dollar amount because it establishes Carolina as the leader in the worldwide research community in taming the data deluge and as the data federation hub for collaborative research. It’s a role that is essential for future discoveries and innovations.”

UNC experts will work with six National Science Foundation-supported consortia that will use the new data infrastructure. They are:

• The Ocean Observatories Initiative (OOI), a program led by the University of California at San Diego and the Scripps Institution of Oceanography, which uses data from environmental sensors to study the ocean and seafloor.
• The Consortium of Universities for Advancement of Hydrologic Science Inc. (CUAHSI), a University of South Carolina-led organization that works to advance water science.
• Cyber-Infrastructure-Based Engineering Repositories for Undergraduates (CIBER-U), an initiative led by Drexel University, which uses digital design repositories to enhance engineering instruction and learning.
• The iPlant Collaborative, a University of Arizona-led project developing an integrated cyberinfrastructure to advance studies of plant biology.
• The UNC Odum Institute for Research in Social Science, which focuses on teaching and research in the social sciences.
• The Temporal Dynamics of Learning Center, based at the University of California at San Diego, which studies the role of time and timing in learning to improve educational practices.

At Arizona State University, consortium researchers will collaborate on policy-based data management systems. Duke University researchers will develop education and outreach initiatives to broaden the consortium’s impact.

“We see this as the first step to building a data infrastructure that will accommodate collaborative research, new educational approaches and innovative problem solving in academic institutions, in federal agencies and across national boundaries,” said Reagan Moore, Ph.D., the consortium’s principal investigator, a professor in the School of Information and Library Science and a scientist with the Renaissance Computing Institute. “The infrastructure we develop will address all stages in the community-based data collection lifecycle, from initial collection formation for a single project, to shared collections across institutions, to formation of data processing pipelines, to publication and long term preservation.”

Co-principal investigators of the project include Dr. Arcot Rajasekar, SILS professor and research scientist at RENCI; Dr. John Orcutt, OOI; Dr. William C. Regli, CIBER-U, Drexel University; and Dr. Jonathan Goodall, CUAHSI, University of South Carolina.

Photos, from left to right: Dr. Helen Tibbo, Dr. Richard Marciano and Dr. Christopher Lee.

SILS faculty are a critical part of the team determining the information needs of the science partners for the development of the iRODS system included in the DataNet project. They include Dr. Helen Tibbo, alumni distinguished professor, and Dr. Christopher (Cal) Lee, associate professor, both senior personnel on the project. They will lead the Policy and Standards Community of Practice, which is charged with identifying the practices and needs of the scientific groups and the associated requirements for policies and processes. Dr. Richard Marciano, professor, leads development of sustainability mechanisms for data collections.

During the first 18 months of the grant, the consortium will focus on federating the data management cyberinfrastructure for the OOI, CUAHSI and CIBER-U. The work will include identifying federation requirements, integrating existing data management systems, deploying a federation hub and developing policies and procedures for data sharing so that the data collections of these research communities can become the foundation of a national data cyberinfrastructure.

