Digital Libraries Colloquium


Mark Derthick, Research Scientist in CMU’s School of Computer Science, will be a featured speaker at the upcoming Digital Libraries Colloquium Series event on December 6, 2006. This series, now in its sixth season, brings internationally recognized experts to discuss both technology and human behavior in the context of digital libraries. Dr. Derthick will present “Exploratory Data Analysis and Visualization for Everyone” at 1:00 pm in Room 501, School of Information Sciences.

The Digital Libraries Colloquium is sponsored by the School of Computer Science-Carnegie Mellon University, the School of Information Sciences-University of Pittsburgh, the University Library System-University of Pittsburgh, the University Libraries-Carnegie Mellon University and the Carnegie Library of Pittsburgh. 

Derthick will discuss how Internet search engines have generated widespread demand for information retrieval from unstructured documents. The number of structured and semi-structured documents available on the Web is also huge, and collections of these are better suited to data mining than to search engine retrieval. Finding patterns in databases of political contributions, environmental data, or hospital and school performance would surely interest many citizens. Yet data mining has seen no explosion of interest comparable to that of search engines. Why?

The main research question is how to support such exploration for users with little or no training in statistics or programming. In contrast to other data mining systems, Bungee View focuses on learnability, responsiveness, robustness, and providing a satisfying user experience. This talk will describe users' experiences with Bungee View in the lab and on three Web-based image collections.

Bio: Mark Derthick received his PhD in Computer Science from Carnegie Mellon University in 1988 for his thesis that connectionist models of knowledge representation and reasoning would degrade more gracefully than symbolic frameworks in the face of incomplete and inconsistent information. His current projects are summarizing probability distributions over tens of thousands of possible evolutionary trees for biologists and developing an enjoyable interface for non-technical users to browse and data-mine image collections.