UNC School of Information and Library Science (SILS) faculty members David Gotz and Yue “Ray” Wang have been awarded over $200,000 in grant funding to develop a system that will allow users to bring unstructured contextual information into structured data visualizations.
The year-long project, “Visual Data Exploration for Integrated Structured/Unstructured Analysis,” is part of a collaboration between Carolina and the Laboratory for Analytic Sciences at NC State University. The research is supported by funding from the U.S. Department of Defense.
Analysts often need to synthesize information found in both structured and unstructured data sources. For instance, social scientists may study both government data and social media posts; cyber intelligence analysts review event logs and textual reports; physicians and medical researchers examine health records and medical literature.
Discoveries and insights identified through one resource usually serve as the context for subsequent analysis within the other, but because current systems use separate tools to manage structured and unstructured data, analysts must transfer these contexts mentally.
Gotz and Wang believe that an ideal system should offer a unified interface for examining and analyzing data in both forms, and that a consistent analysis context should be shared across resources to facilitate fully integrated visual data exploration.
They aim to make progress toward this goal with their current project. While their work will be applicable across domains, they plan to develop their initial prototypes and algorithms using medical data: electronic data for roughly 30,000 patients from the UNC Clinical Data Warehouse, 29 million scientific abstracts from PubMed, and more than 300,000 clinical trial descriptions from ClinicalTrials.gov.
“We already have access to these datasets and have successfully utilized this data in prior work,” Wang said. “We hope to demonstrate how a health analyst who has identified poor outcomes in a patient population that has undergone specific treatments can interactively merge patient data with related documents from medical literature that can either provide support for the discovery or suggest alternative explanations.”
An associate professor at SILS, Gotz also serves as Assistant Director of the Carolina Health Informatics Program (CHIP) and Director of the Visual Analysis and Communication Laboratory (VACLab). Wang is an assistant professor at SILS whose research focuses on developing principled interactive machine learning algorithms to minimize data scientists' efforts in producing high-quality results.