Getting your Data Back by Giving it Away

by Jacco van Ossenbruggen

CWI teams up with an Amsterdam-based consortium to develop web services on open linked data in the domain of cultural heritage. The approach is gaining momentum, but remains challenging from both the researchers' and the museums' perspectives.

The Dutch MultimediaN e-culture project is applying the idea of open linked data to the traditionally closed and isolated worlds of cultural heritage collections. Under the hood, new Semantic Web technology is being developed to realize new search services and smarter Web interfaces that provide access to multiple, information-rich museum collections.

All features developed within the project are, however, judged by how well they convey the main principle underlying the project: by providing the user with meaningful relationships across collections, all individual data collections grow in value. In the same way that explicit hyperlinks add value to all documents being linked to and from, explicit relationships on the Semantic Web add value by providing context to previously isolated data.

The project builds on the fact that Amsterdam happens to be the home of three research institutes with world-class Semantic Web expertise: Vrije Universiteit, University of Amsterdam and CWI. The project team is completed by representatives from two key Dutch cultural heritage institutes, ICN and DEN, and the project cooperates closely with a growing list of museums. Started in 2004, it is financed with revenues from the Dutch natural gas reserves through the national government's BSIK program.

From the beginning, all partners have been cooperating closely, in a way that is rarely seen in computer science research projects. Comparable projects typically spend the last few months before the end of the project building a proof-of-concept prototype to demonstrate that the innovations developed by the individual project partners actually work together. This project took the opposite route, and released the first version of an integrated prototype even before the first PhD student started work on the project in late 2005. As a result, all senior and junior researchers involved in the project have used, and contributed to, the same experimentation platform from the very beginning.

The platform is based on the open source SWI-Prolog package, and all generic Semantic Web technology developed within the project is released as part of the standard distribution. Conversion software has been developed to convert museums' collection databases, thesauri and other sources of domain knowledge to RDF, and to create meaningful links between the sources. For CWI, the main research challenges are the design and evaluation of new search functionality that is made possible by the linked data, and the design and evaluation of the associated web interfaces.
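To make the conversion step concrete, the following is a minimal sketch of how a single collection record might be turned into RDF triples and linked to a shared thesaurus. It is not the project's actual conversion software, which is built on SWI-Prolog; Python is used here purely for illustration. The `example.org` namespace, the `THESAURUS` table, the record fields and the function names are all assumptions introduced for this example. Only the Dublin Core element namespace is an existing vocabulary.

```python
from urllib.parse import quote

# Hypothetical namespace for the example; not a project URL.
BASE = "http://example.org/"
# Dublin Core element set, an existing vocabulary for titles and creators.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical thesaurus: normalized label -> URI of the shared concept.
THESAURUS = {
    "rembrandt van rijn": BASE + "thesaurus/artist/rembrandt",
}


def literal(value):
    """Quote a string as an N-Triples literal (simplified escaping only)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def record_to_triples(record):
    """Convert one collection record (a dict) to N-Triples lines.

    The creator is linked to a thesaurus URI when the name is known;
    otherwise it is kept as a plain literal, so no data is lost.
    """
    subject = f"<{BASE}object/{quote(str(record['id']))}>"
    triples = [f"{subject} <{DC}title> {literal(record['title'])} ."]

    creator = record.get("creator", "")
    uri = THESAURUS.get(creator.strip().lower())
    if uri:
        # Explicit link: every record naming this artist now shares a node.
        triples.append(f"{subject} <{DC}creator> <{uri}> .")
    elif creator:
        triples.append(f"{subject} <{DC}creator> {literal(creator)} .")
    return triples


if __name__ == "__main__":
    record = {"id": "SK-C-5", "title": "The Night Watch",
              "creator": "Rembrandt van Rijn"}
    for line in record_to_triples(record):
        print(line)
```

The point of the sketch is the branch on the thesaurus lookup: once records from different museums point at the same artist URI rather than at differently spelled name strings, a search service can follow that shared node across collections.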

Project leader Guus Schreiber (right) demonstrating the project to Tim Berners-Lee, Director of the World Wide Web Consortium, at ISWC 2006.

The project's approach paid off immediately: eight months later, in August 2006, the second release of the platform was submitted to the International Semantic Web Challenge, a submission that turned out to be the winning one at the International Semantic Web Conference in November 2006. The project's impact goes, however, well beyond the computer science research world. Less than five months after the ISWC award ceremony, the project was presented to the international cultural heritage community, which had gathered in San Francisco in April 2007 for the Museums and the Web conference. There it turned out that the project had made the right decision in basing its strategy on open and linked data. Museums and archives all over the world are starting to realize the serious limitations of vendor lock-in and closed proprietary solutions. While we researchers think about ways to give a wide audience better access to museum collections over the public Web, on the other side of the firewall many museums are fighting to get access to their very own data, as it is locked inside proprietary software without public APIs.

In this context, it is clear that a future in which data is open may scare off many in the museum world. It takes time to get used to the idea that your own website is no longer the only way to access `your' data. But without a doubt, sooner or later the museum itself, unknown users and third parties will develop a wide range of new web services, mash-ups, widgets, social tagging applications and much more, all based on the museum's data. This is simply because what the curators call `their' data represents `our' heritage, and there are just too many of us users interested in these rich information resources. Providing a wider public with access to data in a commonly agreed upon, open and linkable format is definitely the way to go. It may even be the only way to get back the large amounts of valuable data that are now locked up inside proprietary formats.

Link:
http://e-culture.multimedian.nl/

Please contact:
Jacco van Ossenbruggen
CWI, The Netherlands
Tel: +31 20 592 4141
E-mail: Jacco.van.Ossenbruggen@cwi.nl