RESEARCH

TR-2009-01

Marchionini, Gary; Tibbo, Helen; Lee, Cal A.; Jones, Paul; Capra, Robert; Geisler, Gary; Russell, Terrell; Shah, Chirag; Sheble, Laura*; Jorda, Sarah; Song, Yaxiao; Howard, Dawne E.; Clemens, Rachael; and Hill, Brenn.

*Thank you, Laura Sheble, for crafting this final report from four years of VidArch Project work.

VidArch: Preserving Video Objects and Context Final Report

ABSTRACT
Video is becoming increasingly important to digital libraries and archives, both as a primary content type and as context for other collection objects. Videos included in collections may be works unto themselves; documentary evidence of people, places, and events core to a collection mission; or documentary evidence for primary objects in a collection. Recognizing these roles of video in popular and scholarly culture, the Library of Congress included the VidArch project in its NDIIPP research portfolio. The VidArch project contributed to the further development of policies and tools to facilitate the preservation of digital video from the WWW through an examination of video not as isolated information objects, but as information-rich multi-sensory elements embedded in an equally information-rich use environment. We explored the meaning of context as it relates to video from internal, external, and life cycle-based perspectives. On one hand, we performed video content analysis and conceptualized a multi-faceted understanding of context based on the life cycle of video production, delivery, and use. On the other, we explored relationships between video and other elements of the networked online environment. On an implementation level, we evaluated the use of finding aids and documentation of contextual information for controlled video collections; explored the use of a collaborative online environment as an extension of the concept of the finding aid; developed tools to mine contextual elements for online video from the WWW; and implemented the robust preservation-compatible iRODS framework at the collection level for primary objects and related contextual information entities. The VidArch Project was supported by a grant from the National Science Foundation (#IIS 0455970 DigArch Program) as one of the NDIIPP research projects, and by a follow-up contract from the Library of Congress as a part of the National Digital Information Infrastructure and Preservation Program.
While based at the University of North Carolina at Chapel Hill School of Information and Library Science, the Project was enriched by valuable contributions from our partners: the Association for Computing Machinery (ACM); iBiblio; the Internet Archive; the National Aeronautics and Space Administration (NASA); and the San Diego Supercomputer Center Data Intensive Cyber Environments (DICE) team.

TR-2008-03

Weimao Ke, Cassidy R. Sugimoto, and Javed Mostafa
Laboratory of Applied Informatics Research

Dynamicity vs. Effectiveness: A User Study of a Clustering Algorithm for Scatter/Gather

ABSTRACT
We proposed and implemented a novel clustering algorithm called LAIR2, which has linear worst-case time complexity and constant average running time for on-the-fly Scatter/Gather browsing [4]. Our previous experiments showed that when running on a single processor, the LAIR2 on-line clustering algorithm was several hundred times faster than the parallel Buckshot algorithm running on multiple processors [11]. This paper reports on a study that examined the effectiveness of the LAIR2 algorithm in terms of clustering quality and its impact on retrieval performance. We conducted a user study with 24 subjects to evaluate on-the-fly LAIR2 clustering in Scatter/Gather search tasks by comparing its performance to the Buckshot algorithm, a classic method for Scatter/Gather browsing [4]. Results showed significant differences in terms of subjective perceptions of clustering quality. Subjects perceived that the LAIR2 algorithm produced significantly better quality clusters than the Buckshot method did. Subjects felt that it took less effort to complete the tasks with the LAIR2 system, which was more effective in helping them in the tasks. Interesting patterns also emerged from the subjects’ comments in the final open-ended questionnaire. We discuss the implications and future research.

TR-2008-02

Losee, Robert

Vocabulary Conversion: Performance with Controlled and Uncontrolled Terms and Tags

ABSTRACT

Controlled and uncontrolled indexing terminology and metadata may be converted from one to another. Decision criteria are developed that can be used to determine which terms should be assigned when converting vocabularies. Methods are developed for computing the parameters of these systems, as well as means for estimating the parameters when given limited information. These conversion techniques may be applied to thesaurus terminology, gene ontologies, topic maps, uncontrolled natural language terms, folksonomies, tags and labels on web pages, the presence or absence of a specific hyperlink, as well as to metadata. Rules are provided suggesting circumstances when controlled vocabularies are always superior to using uncontrolled vocabularies.

TR-2008-01

Kelly, Diane; Shah, Chirag; Sugimoto, Cassidy R.; Bailey, Earl W.; Clemens, Rachel A.; Irvine, Ann K.; Johnson, Nicholas A.; Ke, Weimao; Oh, Sanghee; Poljakova, Anezka; Rodriguez, Marcos A.; van Noord, Megan G.; and Zhang, Yan

Method Bias? The Effects of Performance Feedback on Users’ Evaluations of an Interactive IR System

ABSTRACT
In this study, we seek to understand how providing feedback to users about their performances with an interactive information retrieval (IIR) system impacts their evaluations of that system. Sixty subjects completed three recall-based searching tasks with an experimental IIR system and were asked to evaluate the system after each task and after finishing all three tasks. Before completing the final evaluation, three-fourths of the subjects were provided with feedback about their performances. Subjects were assigned randomly to one of four feedback conditions: a baseline condition where no feedback was provided; an actual feedback condition where subjects were provided with their real performances; and two conditions where subjects were deceived and told that they performed very well or very poorly. Results show that the type of feedback provided significantly affected subjects' system evaluations; most importantly there was a significant difference in subjects' satisfaction ratings before and after feedback was provided in the actual feedback condition. These results suggest that researchers should provide users with feedback about their performances when this information is available in order to elicit the most valid evaluation data.

TR-2007-08

Baldwin, Tim; Christodoulou, Alexandros; Gillenwater, Cary; Johnson, Nicholas; Kumar, Amit; Marchionini, Gary; Moynihan, Brian; Polczer, Gyorgy; Rodriguez, Derek; Purvis, Joshua; and VanDrimmelen, Jeff.

"Click/Talk/Touch/Look/Think Here: User Interface with Virtual Space"

Introduction: The most critical bottlenecks in information flow are human input and output (I/O). These bottlenecks are due to a combination of physiology, cognition, and technological prosthetics and are strongly exacerbated when the information flows are mediated by or with information technology. As people interact with each other or with information systems, the actions taken and the resulting information flows are outputs from the initiator's perspective and inputs from the receiver's perspective. This paper provides an overview of the I/O problem space by examining different theoretical models that are or have been considered in Human Computer Interaction (HCI) as well as summarizing different kinds of techniques and devices that are in use or in development to facilitate human information interaction in cyberspace. People sense the natural world and listen, read, and view information in the built world at differential rates ranging from a few bits per second to millions of bits per second depending on the perceptual organ. People move, talk, and write at relatively slow rates but we have created tools to change the rates. This paper presents an overview of different input devices organized by the degree to which people consciously control the devices (explicit vs implicit), considers some of the advantages and limitations of these devices and trends toward using multiple devices to facilitate natural human-computer interaction.

TR-2007-07

Capra, Robert; and Marchionini, Gary

"Visualizing Science and Engineering Indicators:
Transitioning from Print to a Hybrid World"

Introduction: This report summarizes work on the “Visualizing Science and Engineering Indicators: Transitioning from Print to a Hybrid World” project between the National Science Foundation (NSF) SEI and UNC SILS.

TR-2007-06

Liu, Yong; Mostafa, Javed; and Ke, Weimao

"A Fast Online Clustering Algorithm for Scatter/Gather Browsing"

Abstract: We present a fast online clustering algorithm which has linear worst-case time complexity and constant average running time for the well-known online visually oriented browsing model called Scatter/Gather browsing (Cutting, Karger, Pedersen, and Tukey 1992). Our experiment shows that when running on a single processor, this fast online clustering algorithm is a few hundred times faster than the parallel Buckshot algorithm running on multiple processors.

TR-2007-05

By the members of the Spring 2007 Public Libraries Seminar

"American Public Library Topics - an Annotated Bibliography"

Abstract: During the spring 2007 semester at the School of Information and Library Science at the University of North Carolina at Chapel Hill, the members of the Public Libraries Seminar considered the state of the American Public Library from several aspects.

After pondering the philosophical, political, professional, and ecological contexts in which the public library exists, each of the members guided the seminar through a topic area that held special meaning for them. The result of these guided tours is the annotated subject bibliography contained in this report.

While the bibliographies are probably a full and fairly complete resource for anyone else interested in the topics discussed, the goal was not to create a dry academic resource. Rather the objective in creating the bibliographies was that the students list those resources that held particular meaning for them, and that their comments about the resources be personal, sincere, and tied to their individual concerns.

This is the third iteration of a public library bibliography and supplements those created by the members of the spring 2005 and spring 2006 Public Libraries Seminars. The three together form a solid foundation for subsequent public library seminars to modify, add to, and enhance.

TR-2007-04

Lee, Christopher A.

"Taking Context Seriously: A Framework for Contextual Information in Digital Collections"

Abstract: Future users of digital objects will likely have numerous tools for discovering preserved digital objects relevant to their interests, but making meaningful use and sense of the digital objects will also require contextual information. This paper provides an analysis of context, distinguishing three main ways in which that term has been used within the scholarly literature. I then discuss contextual information within digital collections. I present a framework for contextual information that is based on nine classes of contextual entities: object, agent, occurrence, purpose, time, place, form of expression, concept/abstraction, and relationship. The paper then discusses existing standards and guidance documents for encoding information related to the nine classes of contextual entities, and it concludes with a discussion of potential implications for descriptive practices through the lifecycle of digital objects.

TR-2007-03

Luo, Lili

"Reference Evolution under the Influence of New Technologies"

Abstract: This report presents a historical view of library reference evolution under the influence of new information technologies. Two evolution directions were determined through a comprehensive literature review - the increase of the availability and accessibility of electronic resources and the expansion of the media through which reference services are provided.

Placing reference progression in a historical context, this article will strengthen the understanding of library reference work, and hence, lead to a more coherent development of the reference profession.

TR-2007-02

Kelly, Diane; Fu, Xin; Shah, Chirag

"Effects of Rank and Precision of Search Results on Users’ Evaluations of System Performance"

ABSTRACT
Previous research has demonstrated that system performance does not always correlate positively with user performance, and that users often assign positive evaluation scores to systems even when they are unable to complete tasks successfully. This paper investigates the relationship between actual system performance and users’ perceptions of system performance by manipulating the level of performance experienced by users and measuring users’ evaluations of system performance. Eighty-one subjects participated in one of three laboratory studies. The first two studies investigated the impact of the location (or rank order) of five relevant and five non-relevant documents in a search results list containing ten results. The third study investigated the impact of varying levels of precision (.30, .40, .50 and .60) of a search results list containing ten results. Results demonstrate statistically significant relationships between precision and subjects’ evaluations of system performance, and ranking and subjects’ evaluations of system performance. Of the two, precision explained more variance in subjects’ evaluation ratings and was a stronger predictor of subjects’ ratings. Finally, the number of documents subjects examined significantly influenced their evaluations, even when the difference was a single document.
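The distinction between rank and precision that the abstract draws can be illustrated with a minimal sketch. The function names and the two example lists are hypothetical and are not taken from the report; they only show that two ten-item result lists can share the same precision while placing the relevant documents at different ranks.

```python
def precision(relevance):
    """Fraction of results in the list that are relevant (1 = relevant, 0 = not)."""
    return sum(relevance) / len(relevance)


def relevant_ranks(relevance):
    """1-based rank positions of the relevant results."""
    return [rank for rank, rel in enumerate(relevance, start=1) if rel]


# Two hypothetical ten-item lists, each with five relevant documents.
top_heavy = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
bottom_heavy = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Same precision, different rank order of the relevant documents.
print(precision(top_heavy), relevant_ranks(top_heavy))
print(precision(bottom_heavy), relevant_ranks(bottom_heavy))
```

Varying the number of 1s in a ten-item list gives the precision levels named in the abstract; for example, three relevant documents out of ten corresponds to .30.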

TR-2007-01

By the members of the Spring 2006 Public Libraries Seminar

"American Public Library Topics An Annotated Bibliography"

ABSTRACT
During the spring 2006 semester at the School of Information and Library Science at the University of North Carolina at Chapel Hill, the members of the Public Libraries Seminar considered the state of the American Public Library from several aspects.

After pondering the philosophical, political, professional, and ecological contexts in which the public library exists, each of the members guided the seminar through a topic area that held special meaning for them. The result of these guided tours is the annotated subject bibliography contained in this report.

While the bibliographies are probably a full and fairly complete resource for anyone else interested in the topics discussed, the goal was not to create a dry academic resource. Rather the objective in creating the bibliographies was that the students list those resources that held particular meaning for them, and that their comments about the resources be personal, sincere, and tied to their individual concerns.

This is the second iteration of a public library bibliography and supplements the one created by the members of the spring 2005 Public Libraries Seminar. The two together form a solid foundation for subsequent public library seminars to modify, add to, and enhance.

TR-2006-03

Carter, Tyson; Durbin, Dayna; and McCraw, Jenny

"Library Websites for Elementary-aged Children: A Comparative Analysis"

Abstract: What we set out to do with our project was to create a set of criteria for evaluating library websites for elementary-aged children (interpreted in this investigation as children between the ages of five and eleven), and then to apply those criteria to a small sample of sites. Thus, we started the project with three questions. First, what makes a library site for children effective and appealing, in terms of both content and design? Second, how does a select sample of library sites measure up? Third, how do three different types of library sites (public library sites for children, elementary school library media center sites, and digital or virtual libraries for children) compare? These different types of libraries serve different functions, and are often tied to different activities, and so differences in content seemed likely. These were the major concerns of the project. We set out to explore the literature on design and content selection of websites for children, and to use this literature to inform our criteria for evaluating the sites in our sample.

TR-2006-02

Boekelheide, Kristin; Brown, E. Ashley Rogers; Fu, Xin; Marchionini, Gary; Oh, Sanghee; Rogers, Gershom; Saelim, Billy; Song, Yaxiao; and Stutzman, Fred.

"Audio Surrogation for Digital Video: A Design Framework"

Abstract: This paper provides a framework to guide audio surrogation research and development. It is meant to help system designers identify which kinds of audio surrogates are most appropriate for a specific system, and to help researchers develop research methodologies. After a brief review of the roles that surrogates play in retrieval and sense making, and of some characteristics of audio data, five types of audio surrogates are defined, potential applications are illustrated, and implementation issues are discussed. The paper concludes with a discussion of the implementation issues related to multiple kinds of surrogates in practical video retrieval systems.

TR-2006-01

Wildemuth, Barbara M.; Russell, Terrell; Ward, T. J.; Marchionini, Gary; & Oh, Sanghee. "The Influence of Context and Interactivity on Video Browsing"

Abstract: The goal of this study was to investigate the effects of providing context and interactivity in a retrieval system, supporting the browsing of search result sets. Thus, three systems were developed: (1) a basic system, modeled on the current results list provided by Google Video search (runs UNC-BAS-1 and UNC-BAS-2); (2) a similar system, with the context of each shot provided by showing keyframes from the shots appearing just before and after the retrieved shot (runs UNC-CON-1 and UNC-CON-2); and (3) a system that builds on the previous system by offering several mechanisms of interactivity (runs UNC-INT-1 and UNC-INT-2). In terms of both performance and user perceptions, the Context+Interactive system was superior. While there were no differences in precision, recall was improved with this system, and users preferred it (based on several measures of user perceptions). The effects of context on browsing search results were negligible, but should be explored further through re-examination of the definition and operationalization of the concept of context. Interactivity, in combination with context, had positive effects on browsing effectiveness; it was considered easy to use, even though it introduced more complexity into the interface.

TR-2005-03

Marchionini, Gary; Elsas, Jon; Zhang, Junliang; Efron, Miles; and Haas, Stephanie. "Clustering Techniques, Tools, and Results for BLS Website Maintenance and Usability" October 15, 2005

Abstract: This project was a BLS-focused adjunct to a National Science Foundation Digital Government grant to define a statistical knowledge network and user interfaces that will help citizens easily find and understand government statistical information. The BLS effort focused on discovering ways to automatically categorize BLS webpages and use these new categorizations in dynamic user interfaces under development in the larger project. The overall aim was to create alternative organizations for the BLS website that people could use to explore and find data more easily and effectively.

TR-2005-02

MacMullen, W. John. "Annotation as Process, Thing, and Knowledge: Multi-domain studies of structured data annotation" May 20, 2005

Abstract: Following Buckland’s (1991) work on the nature of information, this paper characterizes the multi-faceted concept of ‘annotation’ as process, thing, and knowledge. This typology is then used to enumerate general research questions for the exploration of annotation in arbitrary domains. Our research team’s investigation of annotation of structured data in specific domains and user groups is described, including library catalogers, musicians, historical geographers, web users, statistical analysts, and biomedical researchers.

TR-2005-01

Bergquist, Ron. "American Public Library Topics an Annotated Bibliography" May 10, 2005. By the members of the Spring 2005 Public Libraries Seminar at the School of Information and Library Science, University of North Carolina at Chapel Hill.

Abstract: During the spring 2005 semester at the School of Information and Library Science at the University of North Carolina at Chapel Hill, the fifteen members of the Public Libraries Seminar considered the state of the American Public Library from several aspects.

After pondering the philosophical, political, professional, and ecological contexts in which the public library exists, each of the members guided the seminar through a topic area that held special meaning for them. The result of this guided tour is the annotated subject bibliography contained in this report.

TR-2004-02

Pomerantz, J., & Stutzman, F.

"Lyceum: A Blogsphere for Library Reference"

Abstract: In this paper we discuss the use of blogs in libraries, and specifically the potential of blogs for use in library reference services. We describe Lyceum, an open source software project designed by ibiblio.org, which is a facilitator of blogspheres and a tool for intelligent automatic information management within blogspheres. We discuss ways in which Lyceum and blogs in general may facilitate library reference services.

TR-2004-01

Zhang, Junliang; Marchionini, Gary; Shear, Tim; Su, Chang.

Relational Browser: A Fast and Contextualized Searching and Browsing Tool. January 31, 2004.

Abstract: The Relation Browser is a user interface for searching and browsing that supports visual exploration of relationships in datasets. This report describes the latest version of this interface, named RB++. It discusses improvements over previous versions and outlines a user study to test its effectiveness. RB++ uses an improved database scheme, supports arbitrary n-wise exploration within collection facets, closely couples collection overviews with results sets, and adds string search within results sets that are also coupled to the overviews. The system is illustrated with data from the UNC film collection and webpages from the U.S. Energy Information Administration website.

TR-2003-05

Wildemuth, Barbara; Yang, Meng; Hughes, Anthony; Gruss, Rich; Geisler, Gary; Marchionini, Gary. Access via Features versus Access via Transcripts: User Performance and Satisfaction. November 2003.

Abstract: The Open Video Project is specifically concerned with the surrogates that can represent the objects in a digital video collection and the mechanisms through which people can manipulate those surrogates. In TREC VID 2003, we compared the effectiveness of a transcript-only search system, a features-only search system and a search system combining transcript and feature searching. We also presented several different views for users to browse the results pages: a horizontal view, a vertical view, a “before & after” view, and an extra-keyframe view. A within-subjects research design was used, so that each of the 36 participants was exposed to all three search systems. Each participant searched half (12) of the assigned topics. The user satisfaction measures recommended by NIST were augmented by measurements of participants' perceived usefulness, perceived ease of use, and flow. Results indicated that, with the transcript-only system and the combined system, users were able to achieve higher recall in less time per search. The results from the measures of satisfaction indicate that the users found the transcript-only and combined systems to be more useful and easier to use, and their use resulted in stronger perceptions of enjoyment and concentration than the features-only system. It is concluded that, as users gain experience with features searching, it will be a welcome supplement to transcript searching.

TR-2003-04

Barreau, Deborah. The New Informational Professional: Vision and Practice. August 2003.

Abstract: Budget pressures and the proliferation of accessible information on the World Wide Web are among the reasons why several organizations have closed their libraries. Some visionaries suggest that to be viable under these conditions, information professionals should be more integrated with the work of organizations, becoming members of functional teams and providing both traditional and specialized services to these teams. This report describes a case study of four news organizations, two that have adopted this new model for the information professional, and two that have not. Data from newspaper articles and responses to surveys are examined for evidence that the new model influences how services are provided and valued. Although few differences are observed, findings demonstrate the benefits of this model for organizations.

TR-2003-03

MacMullen, W. John. Requirements Definition and Design Criteria for Test Corpora in Information Science. April 2003.

Abstract: This paper argues that structured collections of data and information ("corpora") are needed for research in information science, and to measure the validity, accuracy, and effectiveness of tools, methods, and systems. It examines the needs and uses of corpora, and describes some specific examples from a variety of domains. The paper explores the relationship of scientific methods to corpora design, and then enumerates and discusses a variety of design criteria, primarily from the corpus linguistics literature.

TR-2003-02

Yang, Wildemuth, Marchionini, Wilkens, Geisler, Hughes, Gruss and Webster. Measures of User Performance in Video Retrieval Research. June 2003.

Abstract: Browsing and searching for digital videos online is not as easy as it is with text documents. To address this problem, researchers have begun to create video surrogates to represent video objects. The purpose of this paper is to describe and provide preliminary data regarding six measures that can be used to evaluate the effectiveness of people's interactions with video surrogates. The six types of performance to be measured are object recognition (with text stimuli), object recognition (with graphical stimuli), action recognition, gist determination (free text), gist determination (multiple choice), and visual gist determination. While some additional development of the measures is needed, their initial field testing indicates that they are practical and can differentiate multiple levels of performance with video surrogates. These measures will continue to be refined in studies conducted by the Open Video project; we also encourage others to employ them in video retrieval research.

TR-2003-01

Dominick, Hughes, Marchionini, Shearer, Su and Zhang. Portal Help: Helping People Help Themselves Through Animated Demos. February 2003.

Abstract: This paper describes a rationale for animated demos to help people understand how to complete specific tasks in a WWW environment. A set of animated help demos was created to assist people in adding, deleting, moving, and rearranging a portlet as well as checking library records in the UNC MyPortal application. The process of creating the animated demos is described and pointers to the online animations are given.

TR-2002-04

Osborne, Caroline and Rinalducci, Jennifer. Evaluation of Web-Based Resources within the Art History Discipline. December 2002.

TR-2002-03

Efron, Miles. Amended Parallel Analysis for Optimal Dimensionality Reduction in Latent Semantic Indexing, December 2002.

Abstract: This study describes amended parallel analysis (APA), a novel method for dimensionality estimation in unsupervised learning problems such as information retrieval (IR). At issue is the selection of k, the number of dimensions retained under latent semantic indexing (LSI). APA is an elaboration of Horn's parallel analysis, which advocates retaining eigenvalues larger than the values we would expect under term independence. APA operates by deriving confidence intervals on these “null eigenvalues.” The technique amounts to a series of non-parametric hypothesis tests on the correlation matrix eigenvalues. In the study, APA is tested along with five previous dimensionality estimators on four standard IR test collections. These estimates are evaluated with regard to two standard IR performance metrics. APA appears to perform well, predicting the best values of k on three of eight observations, and never offering the worst estimate of optimal dimensionality.

TR-2002-02

Wildemuth and Carter. The Perceived Affordances of Web Search Engines: A Comparative Analysis, December 2002.

Abstract: One way to evaluate the interfaces of search engines is to analyze the perceived affordances offered by each. In this context, the perceived affordances of a search engine are those aspects of the interface that are perceived by its users as allowing particular functions to be invoked. For example, if a search engine provides one text box that is only 10 characters long, users may perceive that it affords the searching of terms that are 10 characters or less. This study analyzed and compared the perceived affordances of nine of the most popular Web search engines (AltaVista, Ask Jeeves, Excite, Google, Hotbot, LookSmart, Lycos/Open Directory, Northern Light, and Yahoo) in September 2001. The criteria for analysis included characteristics of the text box for entering terms, characteristics of the search button, search syntax, the availability and placement of help for entering search terms, methods for limiting the search results, support for modifying a query, features of the directory structure, characteristics of results displays, and methods for setting user preferences. The analysis was conducted by directly examining the interface of each search engine for each feature or characteristic. In general, some aspects of Web search engine interfaces are becoming more standardized and other aspects vary widely across the search engines. All search engines provide a textbox and some type of accompanying button for entering a query. Almost all of the search engines provide assistance in specifying a query, but only two provide examples of queries on the search page itself. All of the search engines use the same basic syntax for specifying a query, but there is quite a bit of variation in the type and amount of assistance provided via drop-down menus or checkboxes. Direct support for modifying a query was available in only a few of the search engines. 
The results of searches are reported in fairly standard ways: brief summaries provided, usually 10 per page, in relevance order. The implications of these findings for the design of search engines are discussed.

TR-2002-01

Mu & Marchionini. Interactive Shared Educational Environment (ISEE): Design, Architecture, and User Interface, April 2002.

Abstract: The Interactive Shared Educational Environment (ISEE) is an advanced real-time multimedia application that supports highly interactive collaboration and distance learning activities within a heterogeneous network context. The ISEE not only takes full advantage of fast LAN campus networks or Internet2 wide area networks by providing peer-to-peer multicast support, but it also can be used in the less advanced settings of home users. Media (e.g., digital video) are integrated into a desktop style interface in the ISEE. The ISEE allows users to interact with live multicasts, a shared web browser, shared video/audio with thumbnails for quick navigation, and text chat. Collaborative work is supported via a shared time line across the multiple ISEE tools. For example, each comment in the chat text panel is associated with a timestamp, which indicates at which point during the video viewing (i.e., in what context) the comment was made. Clicking that timestamp by another user will jump that user's video player to the same timestamp. One challenge for a collaborative distance learning (CDL) system is to support a high degree of interaction between users and the video player due to the delays associated with re-buffering the video. With pre-buffering and a novel collaboration protocol, ISEE not only supports dynamic user-media interactions in real time, but also guarantees synchronization across participants. The system interface and architecture are discussed.
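The shared time line described in the abstract, where each chat comment carries a video timestamp and clicking it moves another user's player to that point, can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `ChatComment`, `VideoPlayer`, `post_comment`, and `click_timestamp` are hypothetical and are not the ISEE implementation, and buffering, networking, and the collaboration protocol are left out.

```python
from dataclasses import dataclass


@dataclass
class ChatComment:
    author: str
    text: str
    video_time: float  # seconds into the video when the comment was made


class VideoPlayer:
    """Stand-in for one participant's video player (hypothetical)."""

    def __init__(self) -> None:
        self.position = 0.0  # current playback position in seconds

    def seek(self, seconds: float) -> None:
        self.position = seconds


def post_comment(player: VideoPlayer, author: str, text: str) -> ChatComment:
    """Stamp a new comment with the author's current playback position."""
    return ChatComment(author, text, player.position)


def click_timestamp(player: VideoPlayer, comment: ChatComment) -> None:
    """Jump the clicking user's player to the moment the comment refers to."""
    player.seek(comment.video_time)


# One user comments while watching; another clicks the comment's timestamp.
alice, bob = VideoPlayer(), VideoPlayer()
alice.seek(42.5)
comment = post_comment(alice, "alice", "Note the diagram here")
click_timestamp(bob, comment)
print(bob.position)
```

Storing the playback position with the comment, rather than a wall-clock time, is what ties each remark to its context in the video in this sketch.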

TR-2001-08

Hara, Solomon, Sonnenwald, & Kim. An Emerging View of Scientific Collaboration: Scientists' Perspectives on Collaboration and Factors that Impact Collaboration, December 2001.

Abstract: Collaboration is often a critical component in scientific research, which is dominated by complex problems, rapidly changing technology, dynamic growth of knowledge, and highly specialized areas of expertise. An individual scientist can seldom provide all of the expertise and resources necessary to address complex research problems. This paper describes collaboration among a group of scientists, and considers how their experiences are socially shaped. The scientists were members of a newly formed distributed, multi-disciplinary academic research center that was organized into four multi-disciplinary research groups. Each group had 14 to 34 members, including faculty, postdoctoral fellows and students, at four geographically dispersed universities. To investigate challenges that emerge in establishing scientific collaboration, data were collected about members' previous and current collaborative experiences, perceptions regarding collaboration, and work practices during the center's first year of operation. The data for the study include interviews with members of one research group, observations of videoconferences and meetings, and center-wide sociometric data. Data analysis has led to the development of a framework that identifies forms of collaboration that emerged among scientists (e.g., complementary and integrative collaboration) and associated factors that influenced collaboration, including personal compatibility, work connections, incentives and infrastructure. These results may inform social and organizational practices needed to establish collaboration in distributed, multi-disciplinary research centers.

TR-2001-07

Wildemuth, Sonnenwald, Bollenbacher, Byrd & Harmon. Mentoring Future Biologists via the Internet: Results from the “Electronic Mentoring for Tomorrow's Scientists” Program, September 2001.

Abstract:  The E-Mentoring program provided biology students from two historically minority universities in North Carolina with opportunities to interact and develop relationships with corporate scientists, to expand their learning horizons, and to use technology in a meaningful way. To provide a meaningful context for electronic mentoring for students, the project was integrated with undergraduate and graduate biology courses at rural and urban universities in lower socio-economic areas. To learn from this experience, an intensive evaluation was conducted. Each participant filled out a detailed questionnaire and was interviewed, both before and after their participation in the E-Mentoring program. In addition, messages between students and mentors were archived. These data are analyzed and discussed in this report.

TR-2001-06

Brunk & Marchionini. Toward an Agile Views WWW Sitemap Kit: The Generalized Relation Browser, January 2001.

Abstract: This paper describes the design, development, and testing of one component for creating data-driven sitemap tools used to enhance information seeking on the web. Such tools are set up by a website administrator to provide alternative browsing and navigation aids. The Generalized Relation Browser (GRB) illustrates the look-ahead strategy for web navigation within the Agile Views design framework. Agile views define control mechanisms and interfaces for overviews, previews, reviews, peripheral views, and shared views to help people make better decisions while browsing and exploring. The GRB is a follow-up to the Federal Statistics Relation Browser prototype, and usability and field test results are summarized. The architecture for using GRB as a general purpose tool is described.

TR-2001-05

Jackson-Sanborn, Odess-Harnish, & Warren. Website Accessibility: A Study of ADA Compliance, June 2001.

Abstract: As larger portions of the population access the Internet, websites must take into consideration the needs of users with various disabilities. Given that about one in five Americans has some form of disability (Census), it is not surprising that much attention is given to applying the Americans with Disabilities Act (ADA) to Internet site designs. This project used the Bobby analysis tool to examine 550 websites in six categories for ADA compliance. One hundred websites in each of the categories most popular, international, jobs, college, and government, and 50 websites in the clothes category, were selected using the What'sHot web site. The sites were examined in the Spring of 2001. Only one-third of all the sites were found to be compliant at priority 1 with no user check errors required. Government sites were the most compliant, with 60% of the sites passing. The other categories were compliant at the following levels: college (43%), clothes (40%), international (29%), jobs (19%), and most popular (15%). See the report for other analyses at other levels of compliance.
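
The per-category figures in this abstract can be checked against the overall "one-third" claim with a short calculation. The sketch assumes 100 sites in each of the five larger categories and 50 in the clothes category, which is the reading that adds up to the 550 sites stated. Because the reported percentages are rounded, the passing counts derived from them are approximate.

```python
# (number of sites examined, reported percent compliant) per category.
# Category sizes are an assumption: 5 x 100 + 50 = 550, matching the abstract.
categories = {
    "government": (100, 60),
    "college": (100, 43),
    "clothes": (50, 40),
    "international": (100, 29),
    "jobs": (100, 19),
    "most popular": (100, 15),
}

# Total number of sites across all categories.
total_sites = sum(n for n, _ in categories.values())

# Approximate number of passing sites, derived from the rounded percentages.
total_passing = sum(n * pct // 100 for n, pct in categories.values())

# Overall compliance rate as a percentage of all sites examined.
overall_pct = 100 * total_passing / total_sites

print(total_sites, total_passing, round(overall_pct, 1))
```

Under these assumptions the script prints `550 186 33.8`, which is consistent with the statement that about one-third of all sites were compliant.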

TR-2001-04

Gilchrest & Long. An Analytical Study of Browsing Strategies in a Content-Based Image Retrieval System, June 2001.

Abstract: Image retrieval systems are available on the WWW but there are few studies of how people actually search for images using content-based image retrieval (CBIR) systems. This study applied Kwasnik's functional components of browsing to create a model of browsing for images. This model was then examined by analyzing transaction logs from a WWW CBIR service that used query-by-example entry. Almost one-quarter of the users abandoned the site before executing a search, a quarter of the users opted for a random starting image, 15% typed an image number (known item search), and the remainder picked one of the sample images to begin browsing. In addition, the authors took turns conducting the same searches, observing search strategies and behaviors. Both sets of data suggest that Kwasnik's browsing model applies to image browsing.

TR-2001-03

Sonnenwald, Marchionini, Wildemuth, Dempsey, Viles, Tibbo, Smith. Collaboration Services in a Participatory Digital Library: an Emerging Design, February 2001.

Abstract: Digital libraries need to provide and extend traditional library services in the digital environment. This paper presents a project that will provide and extend library services through the development of a sharium--a workspace with rich content and powerful tools where people can collaborate with others or work independently to explore information resources, learn, and solve their information problems. A sharium is a learning environment that combines the features of a collaboratory, where people collectively engage in research by sharing rich information resources, and a local library, where people come to meet, find information resources, and discuss common interests. To achieve this, collaboration services that build on synchronous and asynchronous communication technology should be integrated with other digital library services, including searching, browsing, and information management and authoring services. This paper presents our motivation for providing collaboration services and describes the types of collaboration services that will be included in the digital library.

TR-2001-02

Sonnenwald, Bolliger, Solomon, Hara, Cox. Collaboration in the Large: Using Video Conferencing to Facilitate Large Group Interaction, January 2001.

Abstract:  Large group collaboration is a strategic component of many research and development (R&D) centers today. Centers may have 50 to 100 or more participating principal investigators, undergraduate and graduate students, postdoctoral fellows and industry members. Because center members are geographically distributed and may not have interacted with each other previously, it can be difficult to establish and maintain collaboration among members. To address this challenge in the NSF Science and Technology Center for Environmentally Responsible Solvents and Processes, we are applying an action research approach that considers social/organizational and technical aspects of large group collaboration when establishing mechanisms to facilitate collaboration among group members. This paper describes the social, organizational and technical infrastructure and best practices that have emerged using large group video conferencing technology to support collaboration in the large. Social and organizational practices that have evolved include: facilitation before, during and after video conference meetings; the adoption of visual aids to match video conference technology constraints; and the adaptation of participant etiquette. Technical practices that have evolved include: upgrades to video conference equipment; the use of separate networks for broadcasting camera views, presentation slides, and voice; and implementing new technical operations practices to support dynamic interaction among participants at each location.

TR-2001-01

Sonnenwald, Wildemuth. Investigating Information Seeking Behavior Using the Concept of Information Horizons, January 2001.

Abstract: As research questions and topics in information studies evolve, there is a continual need to seek out innovative research methods to help us investigate and address these questions. This paper presents an emerging research method, the creation and analysis of information horizon maps, and discusses the use of such maps in an ongoing research study. Sonnenwald's (1999) framework for human information behavior provides a theoretical foundation for this method. This theoretical framework suggests that within a context and situation is an 'information horizon' in which we can act. Study participants are asked to describe several recent information seeking situations for a particular context, and to draw a map of their information horizon in this context, graphically representing the information resources (including people) they typically access and their preferences for these resources. The resulting graphical representations of their information horizons are analyzed in conjunction with the interview data using a variety of techniques derived from social network analysis and content analysis. In this paper these techniques are described and illustrated using examples from an ongoing study of the information seeking behavior of lower socio-economic students. These techniques are compared to other techniques that could be used to gather data about people's information seeking behavior.

TR-2000-03

Webster, Brassell, Sonnenwald, Wildemuth, Harmon, Byrd, Bollenbacher. E-Mentoring Handbook, September 2000.

Abstract: We describe lessons learned from two pilots of an electronic mentoring program to connect undergraduate and graduate science students in lower socio-economic areas with corporate scientists. All program activities and materials developed and used during the pilots are described. These include brochures, web-based e-mentoring software, training materials and evaluation materials.

TR-2000-02

Geisler, Gary. Enriched Links: A Framework For Improving Web Navigation Using Pop-Up Views, February 2000.

Abstract: We describe a conceptual framework for enriching Web links by displaying small, information-rich visualizations (pop-up views) that provide the user with information about linked pages that can be used to evaluate the appropriateness of the pages before making a commitment to select the link and wait for the page to load. Examples of how the enriched links framework could be applied in several contexts, such as e-commerce catalog pages, search results for a video repository, and desktop icons, are also presented.

TR-2000-01

Marchionini, Gary; Geisler, Gary; Brunk, Ben. Agileviews: A Human-Centered Framework for Interfaces to Information Spaces, January 2000.

Abstract: A framework for interface design that provides people with flexible control over different views for an information space is presented. The agileviews framework defines overviews, previews, reviews, peripheral views, and shared views that help people make decisions about where they should focus attention during information seeking. In addition to the views themselves, control mechanisms that facilitate low-effort actions and strategies for coordinating the views are discussed. Agileviews are particularly useful when specific partitions of large information spaces such as the WWW have been identified. Examples of these views are provided from several different projects and suggestions for additional research and development are made.

TR-1999-05

Dempsey, Bert J.; Weiss, Debra; Jones, Paul; Greenberg, Jane. A Quantitative Profile of a Community of Open Source Linux Developers, October 6, 1999.

Abstract: Open source software, or free software, has generated much interest and debate in the wake of a number of high-impact applications and systems produced under open source models for development and distribution. Despite the high degree of interest, little hard data exists to date on the membership of collaborative open source communities and the evolutionary process of their repositories. This paper contributes a baseline quantitative study of one of the oldest continuous repositories for the Linux open source project (the UNC MetaLab Linux Archives), including demographic information on its broad community of developers. Our methodology is a close examination of collection statistics, including custom monitoring scripts on the server, as well as an analysis of the contents of user-generated metadata embedded within the Archives. User-generated metadata files in a format known as the Linux Software Map (LSM) are required when submitting open source software for inclusion in non-mirrored portions of the MetaLab Linux Archives. The more than 4500 LSMs in the Archives thus provide a demographic profile of contributors of LSM-accompanied software as well as other information on this broad subset of the Linux community. To explore repository evolution directly, an instrumented Linux Archives mirror was developed, and aggregate statistics on content changes seen over a month-long period are reported. In sum, our results quantify aspects of the global Linux development effort in dimensions that have not been documented before now, as well as providing a guide for more detailed future studies.

TR-1999-04

Viles, Charles L. Content Locality in Time-Ordered Document Collections, September 13, 1999.

Abstract: Using newswire data sources from the TREC corpus, we show that the distribution of relevant documents with respect to time can be decidedly non-uniform. Many TREC topics show time-based clustering of relevant documents. We denote this clustering content locality and provide a simple metric for its measurement in time-ordered document collections. There is a marked positive correlation between content locality measurements from two time-synchronized data sources. Given this correlation, we show that knowledge of the distribution of content locality in one document source can provide modest improvement in retrieval results in a companion, time-synchronized document source. While this data is preliminary, it illustrates the potential of using time as an additional feature in retrieval.

TR-1999-03

Brunk, Benjamin D. Overview and Preview Tools for Navigating the World-Wide Web, July 31, 1999.

Abstract: This paper examines the problems inherent in navigating the World-Wide Web. It discusses the work done by others in crafting techniques, software products, and research prototypes that attempt to improve the browsing experience through the application of information visualization in the form of sitemaps. This paper also describes an animated technique to generate previews and overviews of a web site in order to get a better understanding of its contents. The final section includes a technical description of an early prototype tool that uses this animated technique, with preliminary findings from an informal feasibility study involving 19 subjects.

TR-1999-02

French, James C.; Viles, Charles L. Personalized Information Environments: An Architecture for Customizable Access to Distributed Digital Libraries, February 8, 1999.

Abstract: We describe the conceptual architecture of a Personalized Information Environment (PIE). A PIE allows unified, highly customizable access to distributed information resources by providing users the tools to compose personalized collections from a palette of information resources. The architecture also provides for the efficient “exchange” of inter-resource meta-information like collection statistics in order to maximize retrieval effectiveness. This paper includes the enunciation of the user-centered PIE vision, an architectural requirements specification, and an architectural description that meets the specification and supports the vision. We also describe our current implementation and research efforts conducted within the PIE framework.

TR-1999-01

Dempsey, Bert J.; Weiss, Debra. Towards an Efficient, Scalable Replication Mechanism for the I2-DSI Project, April 30, 1999.

Abstract: This paper presents the development of new functionality for the open-source rsync utility aimed at producing an efficient, scalable solution for multiple-site file synchronization. The context of our work is the Internet2 Distributed Storage Infrastructure (I2-DSI) project, which is developing a reliable, scalable, high performance storage service infrastructure for advanced applications in research and education. Specifically, the I2-DSI project is working on middleware software to enable the replication of applications across a set of geographically distributed hosts. This paper presents a new mechanism for replicating filesystems, rsync+, which is a modification of an open-source rsync file synchronization utility. Using rsync+ for file updates, a flexible, powerful replication mechanism can be developed for publishing source objects into the I2-DSI replication service, and the approach enables scalable network distribution through multicast-based solutions. The paper presents the technical details behind the rsync+ tool, its use as a replication solution within I2-DSI, and performance results from a large-scale (multi-gigabyte) WWW mirroring experiment using rsync+. The mirroring experiment demonstrates correct operation of the rsync+ code and its efficiency gains when used on actual data from an active WWW document archive.
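
The abstract does not describe the internals of rsync+, so the following is only a hedged illustration of the general idea behind multiple-site synchronization: compute a delta between the old and new versions of a file once, then replay that same delta at every site holding the old copy. All names (`make_delta`, `apply_delta`, `BLOCK`) are hypothetical, and the fixed-offset block matching used here is a deliberate simplification, not rsync's rolling-checksum algorithm.

```python
import hashlib

BLOCK = 4  # tiny block size so the example is easy to follow


def _digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def make_delta(old: bytes, new: bytes, block: int = BLOCK) -> list:
    """Describe `new` as copies of blocks from `old` plus literal data."""
    # Index the old file's blocks by digest: digest -> (offset, length).
    index = {}
    for off in range(0, len(old), block):
        chunk = old[off:off + block]
        index.setdefault(_digest(chunk), (off, len(chunk)))

    delta = []
    for off in range(0, len(new), block):
        chunk = new[off:off + block]
        hit = index.get(_digest(chunk))
        if hit is not None and hit[1] == len(chunk):
            delta.append(("copy", hit[0], hit[1]))  # reuse bytes already on site
        else:
            delta.append(("data", chunk))           # ship the new bytes
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Rebuild the new file from a site's old copy and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, off, length = op
            out += old[off:off + length]
        else:
            out += op[1]
    return bytes(out)


old_version = b"AAAABBBBCCCCDDDD"
new_version = b"AAAAXXXXCCCCDDDDEE"

# Compute the delta once at the source...
delta = make_delta(old_version, new_version)

# ...then replay it at several replica sites that all hold the old version.
replicas = [old_version, old_version, old_version]
updated = [apply_delta(copy, delta) for copy in replicas]
literal_bytes = sum(len(op[1]) for op in delta if op[0] == "data")
```

In this toy case only the changed and appended blocks travel as literal data; the unchanged blocks are reconstructed from each replica's existing copy.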
