The following are demonstrations of MetaCombine technologies and methods:
This system lets you browse all of the schemes we tested on the digital-library-only collection (records from AmericanSouth.org). It includes both the "control" scheme and the AI-based schemes that employ classification or clustering. What's left to do: improve the hierarchically clustered scheme.
This is the first combined search system we built. It allows searching both web resources (crawled from the AmericanSouth.org Weblinks items) and AmericanSouth.org native digital library records. We built it by exposing AmericanSouth.org OAI records as web pages with DP9, and then building a Swish-e search system on top of the union of the resulting web pages and the Weblinks pages.
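The idea of indexing the union of two page collections can be sketched with a toy inverted index. This is only an illustration of the concept, not the Swish-e or DP9 setup used in the demo; the sample records, identifiers, and function names below are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query token (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample data: text of pages rendered from OAI records,
# and text of crawled web links pages.
oai_pages = {
    "oai:1": "Civil rights oral history collection",
    "oai:2": "Photographs of southern agriculture",
}
weblink_pages = {
    "web:1": "Southern history resources and archives",
    "web:2": "Agriculture extension bulletins",
}

# One index over the union of both collections.
combined = {**oai_pages, **weblink_pages}
index = build_index(combined)
```

A single query such as `search(index, "history")` then returns matches from both sources, which is the behavior the combined search demo is meant to provide.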
This is our second combined search system. For this one, we conducted a focused crawl, starting from the weblinks URLs and additional URLs selected by experts. We then inferred Dublin Core metadata records for the crawled pages and built an OAI repository of them. Finally, we built a search engine web interface using Lucene as the search back-end, drawing on both the crawl collection and the AmericanSouth.org OAI repositories.
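As a rough illustration of what inferring a Dublin Core record from a crawled page might look like, the sketch below pulls the title and description out of a page's HTML and wraps them in an `oai_dc` element. The project's actual inference method is not described here, so the heuristics, the `infer_dc_record` function, and the sample URL are assumptions made for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Standard namespaces for unqualified Dublin Core in OAI-PMH.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


class TitleAndMetaParser(HTMLParser):
    """Collect the <title> text and the description meta tag from a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def infer_dc_record(url, html):
    """Build an oai_dc element from simple HTML cues (hypothetical heuristic)."""
    parser = TitleAndMetaParser()
    parser.feed(html)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    record = ET.Element(f"{{{OAI_DC_NS}}}dc")
    fields = [
        ("title", parser.title.strip()),
        ("description", parser.description.strip()),
        ("identifier", url),
    ]
    for name, value in fields:
        if value:  # skip fields the page did not provide
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record
```

Calling `ET.tostring(infer_dc_record(url, html), encoding="unicode")` yields an XML fragment that could be placed in the metadata section of an OAI record.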
The visual browsing demo is an interactive Java application that allows many styles of visualization to be applied to a sample clustered collection. The application supports navigating a hierarchical category/cluster structure, lists the resources within each cluster, and represents many attributes of the clusters and resources visually. The demo uses the Prefuse visualization toolkit.
This system allows the navigation of browse schemes over web collection content (acquired via focused crawling). It offers three schemes: classification to the Encyclopedia of Southern Culture (ESC) scheme, flat NMF clustering, and hierarchical NMF clustering. This is the same system as the A1 browse demo, but viewing a different dataset. What's left: improve the hierarchical scheme.
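For readers unfamiliar with NMF (non-negative matrix factorization), a flat clustering can be sketched as follows: factor a document-by-term matrix V into non-negative W and H, then assign each document to the factor with the largest weight in its row of W. This minimal pure-Python version uses the standard multiplicative update rules; it is not the project's implementation, and the tiny term-count matrix is made up for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor v (docs x terms) into w (docs x k) and h (k x terms)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    initial_error = frobenius_error(v, w, h)
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h, initial_error


def assign_clusters(w):
    """Assign each document to the factor with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]


# Hypothetical term counts: rows are documents, columns are terms.
term_counts = [
    [2, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 1, 1],
]
w, h, initial_error = nmf(term_counts, k=2)
labels = assign_clusters(w)
```

A hierarchical variant could, under the same assumptions, apply this procedure recursively to the documents within each cluster, though how the demo builds its hierarchy is not specified here.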
As with the A1 and A2 demos, this system allows the navigation of browse schemes over a collection of resources. In this case the resources are the union set of the native digital library records and focused crawl records, both discussed above. This demo contains the ESC scheme as well as two novel cluster-based schemes (hierarchical and flat), generated from the unique underlying data of the union set. What's left: improve hierarchical cluster scheme.
Note: The code prefixes on the demo systems correspond to project phases from the proposal.