by Joachim Jung, Rainer Simon and Bernhard Haslhofer
Annotations have been a means of scholarly communication and an invaluable research tool for centuries. The Open Source YUMA Universal Media Annotator takes this paradigm and makes it available for the collaborative annotation of online media resources. The YUMA suite of tools can currently be used for image, map, audio and video annotation. It is designed to be integrated into any host environment (like a digital library portal or an online media collection) and enables mash-ups across such environments by exposing annotation data according to the Linked Data principles. Linked Data is also the basis for one of YUMA’s unique features: Semantic Tagging. A semi-automatic mechanism provides users with tag suggestions that can be used to effortlessly augment annotations with structured context information, eg about places, persons of interest or historical periods.
Major memory institutions (libraries, archives, museums, or audio and video collections) have, over the last 2-3 decades, made substantial efforts to bring their collections closer to their users. These efforts have had two strands: (i) the digitization of original collections (like books or administrative records) and (ii) the start of new, “born-digital” collections. While these collections are now often accessible to the public via the World Wide Web, tools that support actual research – scholarly analysis, annotation, and communication – have been scarce. The YUMA Universal Media Annotator (YUMA) aims to provide some of these tools.
YUMA is an Open Source suite of browser-based applications that allow users to annotate different types of media content. It is being developed by the Digital Memory Engineering (DME) research group of the AIT Austrian Institute of Technology in cooperation with the Research Group Multimedia Information Systems at the University of Vienna. The system has seen a number of iterations since 2004 when a first proof of concept was developed as part of the BRICKS (Building Resources for Integrated Cultural Knowledge Services) EU project. It has seen further work in TELplus, a project of ‘The European Library’, the common portal of Europe's national libraries. The current system represents a complete overhaul both in terms of the technology and the user interface. It is currently being developed further as part of the EuropeanaConnect best practice network, which will run until October 2011. EuropeanaConnect is one of the projects set up to build Europeana, the portal that aims to give access to Europe's museums, libraries, archives and audio-visual collections.
Figure 1: YUMA map annotation screenshot.
YUMA is based on a distributed architecture. It is designed to be integrated into a host environment – eg an online library portal – and therefore lacks typical portal features like user management. Instead, it provides APIs and authentication mechanisms that allow the host environment to use YUMA as an external, loosely-coupled service. The system consists of two core elements: the Annotation Suite, an extendable set of browser-based end-user tools for annotating content of specific media types (currently digital image, audio and video files, as well as digitized maps); and the Annotation Server, a common “backend” service used by all of those tools.
The Annotation Suite offers similar functionality across all supported media types: users can create new annotations, view or reply to existing ones and keep track of discussions around individual items or particular annotations via RSS feeds. Each tool provides appropriate selection features for annotating specific parts of an item: shape drawing for images or maps, or time range selection for audio and video material. The map annotation tool (Figure 1) includes a special interface with panning and zooming functionality (think Google Maps), and a set of geographical features such as map geo-referencing and overlay.
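As an illustration of the functionality described above, the following minimal Python sketch models an annotation that targets either a time range or a drawn shape and that may reply to another annotation. All class and field names here (TimeRange, Shape, Annotation, reply_to, thread) are hypothetical and chosen for this example; they are not taken from the YUMA code base.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class TimeRange:
    """Selected part of an audio or video item, in seconds (hypothetical model)."""
    start_s: float
    end_s: float


@dataclass
class Shape:
    """Selected part of an image or map as a polygon outline (hypothetical model)."""
    points: List[Tuple[float, float]]


@dataclass
class Annotation:
    item_uri: str                                   # the annotated media item
    text: str                                       # the annotation body
    fragment: Union[TimeRange, Shape, None] = None  # None = whole item
    reply_to: Optional["Annotation"] = None         # None = starts a discussion


def thread(annotation: Optional[Annotation]) -> List[Annotation]:
    """Follow reply links back to the first annotation; return it root-first."""
    chain = []
    while annotation is not None:
        chain.append(annotation)
        annotation = annotation.reply_to
    return list(reversed(chain))


root = Annotation("http://example.org/video/1", "Who is speaking here?",
                  fragment=TimeRange(12.0, 18.5))
reply = Annotation("http://example.org/video/1", "Probably the mayor.",
                   reply_to=root)
discussion = thread(reply)
```

Keeping the fragment separate from the annotation text is one way to let tools for different media types share the same reply and discussion logic.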
YUMA introduces a novel semi-automatic Semantic Tagging approach that lets users make their annotations semantically more expressive by adding links to relevant ‘Linked Data’ resources. To support users in this task, the tool automatically generates suggestions based on an analysis of the annotation text (and the selected geographic area of a map) and pre-configured Linked Data sets (eg DBpedia and Geonames). Suggestions are presented in the form of a tag cloud (Figure 1) from which users can add proposed links to their annotation. Relevant properties of the added resources are stored as part of the annotation metadata: eg alternative language labels, spelling variants or geo-coordinates. The enriched metadata can later be exploited in the portal to enable advanced search functionality, eg search in multiple languages, search by synonymous names, or geographical search.
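The suggestion step can be pictured with a deliberately simplified Python sketch. It assumes a small in-memory lookup table in place of a real Linked Data set, and plain word matching in place of whatever text analysis YUMA actually performs; the GAZETTEER table, the suggest_tags function and the use of occurrence counts as tag weights are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stand-in for a pre-configured Linked Data set:
# lower-case label -> resource URI (URIs are illustrative only).
GAZETTEER = {
    "vienna": "http://dbpedia.org/resource/Vienna",
    "danube": "http://dbpedia.org/resource/Danube",
    "austria": "http://dbpedia.org/resource/Austria",
}


def suggest_tags(annotation_text, gazetteer=GAZETTEER):
    """Return (label, uri, weight) tuples for known labels found in the text.

    The weight is the number of occurrences, which a tag cloud could map
    to font size. Suggestions are sorted by descending weight.
    """
    tokens = re.findall(r"[a-z]+", annotation_text.lower())
    counts = Counter(token for token in tokens if token in gazetteer)
    return [(label, gazetteer[label], n) for label, n in counts.most_common()]


suggestions = suggest_tags(
    "Vienna lies on the Danube; Vienna is the capital of Austria."
)
```

The user remains in control in this sketch as well: the function only proposes candidates, and accepting or rejecting each link would be a separate, manual step.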
The Annotation Server is the storage and administration backend of the YUMA Annotation Framework. It can be deployed with different relational database systems (such as MySQL or PostgreSQL). The applications in the Suite access, store, update, and delete annotations through a REST API. The Server also offers search (through a GUI as well as through an API) and basic administration features, and provides the infrastructure for RSS feed syndication. The Server can, in turn, act as a Linked Data resource itself: it exposes annotations with unique URIs, which return an RDF representation when resolved. To provide data interoperability, the tool relies on the OAC (Open Annotation Collaboration) model, an emerging ontology for describing scholarly annotations of Web-accessible information.
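To give an impression of what an RDF representation of an annotation might look like, the following Python sketch serializes one annotation as N-Triples. The namespace and the terms Annotation, hasTarget and hasBody reflect our reading of the OAC draft model and should be verified against the specification; the example.org URIs are placeholders, and the function is not part of the YUMA Server.

```python
# Assumed OAC namespace and term names; verify against the OAC specification.
OAC = "http://www.openannotation.org/ns/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_to_ntriples(annotation_uri, target_uri, body_uri):
    """Serialize a single annotation as three N-Triples statements."""
    triples = [
        (annotation_uri, RDF_TYPE, OAC + "Annotation"),
        (annotation_uri, OAC + "hasTarget", target_uri),  # the annotated item
        (annotation_uri, OAC + "hasBody", body_uri),      # the annotation content
    ]
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


rdf = annotation_to_ntriples(
    "http://example.org/annotations/42",
    "http://example.org/maps/7",
    "http://example.org/annotations/42/body",
)
```

Because every annotation has its own URI in this scheme, other applications can link to it or retrieve it without knowing anything about the host portal.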
Besides further development of YUMA’s feature set and user interface, future work primarily addresses system evaluation. We are currently investigating the effect that structured metadata generated collaboratively by users through Semantic Tagging has on search and retrieval. For that purpose, we have created the COMPASS Map Labeller, an online portal used for the study of a map annotation use case. A first outcome of this effort will be a 'ground truth' for the evaluation of map search engines. In a second phase, this ground truth will be used to carry out precision and recall analyses on an annotated map collection. The analysis will help to quantify the improvement in the quality of search results when search is performed on metadata enriched by Semantic Tags rather than on traditional metadata only.
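For readers unfamiliar with the two measures, the following Python sketch shows how precision and recall are computed for a single query against a ground truth. The function name and the example identifiers are invented for illustration and do not come from the COMPASS study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of retrieved items that are relevant
    recall    = share of relevant items that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result and ground truth: two of the four retrieved maps
# are relevant, and two of the three relevant maps were found.
precision, recall = precision_recall(
    retrieved=["map1", "map2", "map3", "map4"],
    relevant=["map1", "map3", "map5"],
)
```

Running the same queries once on traditional metadata and once on tag-enriched metadata, and comparing the resulting pairs of values, is one straightforward way to express the improvement in numbers.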
YUMA: Demonstration: http://dme.ait.ac.at/annotation, source code: http://github.com/yuma-annotation
The European Library: http://www.theeuropeanlibrary.org/
Linked Data: http://linkeddata.org/
COMPASS Map Labeler: http://compass.cs.univie.ac.at/
Joachim Jung, Rainer Simon
AIT Austrian Institute of Technology / AARIT
Cornell University, USA