by Vittore Casarosa and Carol Peters
Nearly forty researchers and practitioners in the IT and cultural heritage sectors participated in a one-day workshop on 'Semantic-driven Interoperability for Digital Objects in the Cultural Heritage Domain'. The workshop was organised as a joint DELOS-MultiMatch event in conjunction with the DELOS Conference, held in Tirrenia, Pisa, 13-15 February 2006.
Interoperability is a hot topic within the digital library and distributed information retrieval research communities. This is evidenced by the European Commission's recent decision to set up a working group on Interoperability and Multilinguism as part of the i2010 digital library initiative. The DELOS Network of Excellence and the MultiMatch specific targeted research project both have strong interests in this area. It was therefore decided to organise a joint DELOS-MultiMatch workshop to investigate the current state of the art and to discuss the issues that hinder the widespread adoption of standards and impede interoperability.
The goal of MultiMatch is to develop a system that will enable users to explore and interact with online cultural heritage (CH) content across media types and language boundaries (see ERCIM News 66, July 2006). This means that the project is acquiring large volumes of heterogeneous domain-specific data, both directly from CH content providers and via focussed web crawling. This data must be processed and categorised. The original idea for the workshop resulted from early discussions within MultiMatch aimed at defining the most appropriate metadata schema and conceptual framework for the project. It was felt that it would be very beneficial to exchange ideas and experiences with people working on similar problems.
DELOS has long been concerned with interoperability and has published a comprehensive report on Semantic Interoperability in Digital Library Systems (publicly available on the DELOS website). The DELOS conference offered the perfect venue for this workshop, and a number of experts in the field (both theoreticians and practitioners) were invited to share their expertise and experiences and to advise the MultiMatch group.
The workshop opened with a brief presentation by Neil Ireson (University of Sheffield) in which he illustrated the main factors impacting on the definition of the MultiMatch knowledge representation framework, the problems currently being addressed and the solutions being considered. The aim was to set the context for the following discussions.
The remainder of the morning session was dedicated to the keynote talks. Martin Doerr (FORTH, Crete), Maja Žumer (University of Ljubljana) and Chrisa Tsinaraki (Technical University of Crete) presented three of the best known existing conceptual frameworks (CIDOC-CRM, FRBR and MPEG-7) and some of the relationships between them. These talks were followed by a lively panel discussion, moderated by Stavros Christodoulakis, aimed at investigating how these frameworks can be made interoperable. During this discussion, Martin Doerr pointed out that in his opinion there is a fundamental confusion between the schema and the ontology levels: ontologies are about the underlying concepts, whereas schemas are concerned with the data. In his view there is no reason not to agree on the concepts, and a common vision should be possible. Doerr stressed that an ontology such as CIDOC-CRM is neutral: it only tells the implementers what kind of reasoning and what relationships are possible, and it is up to them to decide to what level of detail they wish to go. As a first step, it is essential to understand what kind of queries are to be supported by the reference framework adopted. Chrisa Tsinaraki stated that, although ontologies are developed for specific communities, it is also important to aim for widely adopted generic standards - ontologies cannot be just domain-specific but must fit into an overall vision of the world.
The first session in the afternoon was dedicated to a series of position statements by a number of projects and institutions working in the CH domain. The EDLProject, TEL, MICHAEL, BRICKS, IMAGINATION, EPOCH plus the Dutch Cultural Heritage Institution briefly presented the problems they are currently facing in this area and/or the solutions they are adopting. The last speaker presented the perspective of the Text Encoding Initiative, the work being done by the TEI Ontologies SIG working group and the problems this group has faced when trying to map from a TEI document to a model conforming to CIDOC-CRM.
The final session of the workshop began with a presentation of the objectives of the recently formed EC Interoperability Group by Stefan Gradmann (University of Hamburg). This triggered a discussion of the main issues that had emerged during the day, again moderated by Stavros Christodoulakis. Points raised included:
how do you combine different reference models?
how do you handle very heterogeneous data?
what kind of queries do users really want?
how can you handle incomplete and uncertain information, eg information crawled from the web?
Many participants felt that there is a conflict between the needs of the real world (ie achieving interoperability between schemata) and the conceptual level. What is needed is a common conceptual reference framework comprehensive enough to cover the multitude of detail required, while at the same time being both sufficiently simple to use and amenable to the application of automatic population techniques. There was general consensus that, given the current state of the art, it is difficult to envisage achieving this goal. The working notes and presentations of the workshop can be found on both the DELOS and the MultiMatch websites.
The DELOS Network of Excellence on Digital Libraries is managed by ERCIM.
Vittore Casarosa, ISTI-CNR, Italy
Tel: +39 050 3153115
Carol Peters, ISTI-CNR, Italy
Tel: +39 050 3152897