by Carol Peters
The results of the eighth campaign of the Cross-Language Evaluation Forum were presented at a two-and-a-half-day workshop held in Budapest, 19-21 September, immediately following the eleventh European Conference on Digital Libraries (ECDL 2007). The workshop was attended by 120 researchers and system developers from academia and industry.
The main objectives of the Cross-Language Evaluation Forum (CLEF) are to stimulate the development of mono- and multilingual information retrieval systems for European languages and to contribute to the building of a research community in the multidisciplinary area of multilingual information access (MLIA). These objectives are realised through the organisation of annual evaluation campaigns and workshops. The scope of CLEF has gradually expanded over the years. While the main interest in the early years was textual document retrieval, the focus has now diversified to include different kinds of text retrieval across languages as well as retrieval on different kinds of media.
In CLEF 2007 seven tracks were offered to evaluate the performance of systems for:
- mono-, bi- and multilingual document retrieval on news collections (Ad-hoc)
- mono- and cross-language structured scientific data (Domain-Specific)
- multiple language question answering (QA@CLEF)
- cross-language retrieval on image collections (ImageCLEF)
- cross-language speech retrieval (CL-SR)
- multilingual web retrieval (WebCLEF)
- cross-language geographic retrieval (GeoCLEF).
Most of the tracks adopt a corpus-based automatic scoring method for the assessment of system performance. The test collections consist of sets of topics (queries), which are statements representing information needs, and collections of documents (corpora). System performance is evaluated by judging the relevance of the documents retrieved in response to each topic (relevance assessments) and computing recall and precision measures.
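As a minimal sketch of the two measures named above, and not the scoring software actually used in the campaign, the following Python fragment computes set-based precision and recall per topic. The topic identifiers, document identifiers and the function name `precision_recall` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set


def precision_recall(retrieved: List[str], relevant: Set[str]) -> Dict[str, float]:
    """Set-based precision and recall for a single topic.

    precision = relevant documents retrieved / documents retrieved
    recall    = relevant documents retrieved / relevant documents
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical relevance assessments: topic -> documents judged relevant.
qrels = {"T1": {"d1", "d3", "d7"}, "T2": {"d2"}}

# Hypothetical system output: topic -> documents retrieved for that topic.
run = {"T1": ["d1", "d2", "d3", "d4"], "T2": ["d5", "d2"]}

# Score every assessed topic; a topic with no retrieved documents scores zero.
scores = {
    topic: precision_recall(run.get(topic, []), relevant)
    for topic, relevant in qrels.items()
}
```

In this toy data the system retrieves two of the three relevant documents for T1 among four results, so precision is 0.5 and recall is 2/3. Evaluation campaigns typically also report rank-sensitive measures, which this sketch leaves out.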
The following document collections were used in CLEF 2007:
- CLEF multilingual comparable corpus of more than three million news documents in 13 European languages
- CLEF domain-specific corpora: English/German and Russian social science databases
- Malach collection of spontaneous speech in English and Czech, derived from the Shoah archives
- EuroGOV, ca 3.5 M webpages crawled from European governmental sites.
The ImageCLEF track used collections for both general photographic and medical image retrieval:
- IAPR TC-12 photo database; PASCAL VOC 2006 training data
- ImageCLEFmed radiological database consisting of six distinct datasets; IRMA collection for automatic image annotation.
Diverse sets of topics or queries were prepared in many languages according to the needs of the various tracks. For example, this year the Ad-hoc track offered mono- and bilingual tasks for central European languages (Bulgarian, Czech and Hungarian) plus a bilingual task encouraging system testing with non-European languages against English documents. Topics were made available in Amharic, Chinese, Oromo and Indonesian. A special sub-task focused on Indian languages, with Hindi, Bengali, Tamil, Telugu and Marathi offered for search tasks against an English target collection.
Participation again showed a good mix of newcomers and veteran groups with long experience at CLEF. A total of 81 groups submitted results for one or more of the tracks: 51 from Europe, 14 from North America, 14 from Asia, and one each from South America and Australia.
The annual workshop plays an important role by giving all the groups that have participated in the evaluation campaign the opportunity to get together to compare approaches and exchange ideas. The schedule was divided between plenary track overviews and parallel, poster and breakout sessions presenting this year's experiments and discussing ideas for the future. There were several invited talks. Noriko Kando, National Institute of Informatics Tokyo, reported the lessons learned at NTCIR-6 and plans for NTCIR-7 (NTCIR is an evaluation initiative focussed on testing IR systems for Asian languages), while Mandar Mitra, Indian Statistical Institute Kolkata, presented FIRE, a new Forum for Information Retrieval Evaluation for Indian languages. Edouard Geoffrois of the French government described the objectives of the much publicised and ambitious Quaero programme, which has the goal of developing multimedia and multilingual indexing and management tools for professional and general public applications.
The presentations given at the workshop and detailed reports on the experiments of CLEF 2007 and previous years can be found on the CLEF website. The preliminary agenda for CLEF 2008 will be available from mid-November.
From CLEF to Treble-CLEF
Over the years, CLEF has done much to promote the development of multilingual IR systems. However, the focus has been on building and testing research prototypes rather than developing fully operational systems. We believe that the time is now ripe to begin transferring the knowledge acquired to an application setting, and for this reason we are about to launch a new activity, "Treble-CLEF", with three main goals:
- To promote high standards of evaluation in MLIA systems using three approaches: test collections; user evaluation; and log file analysis
- To sustain an evaluation community by providing high quality access to past evaluation results
- To disseminate know-how, tools, resources and best practice guidelines, enabling information system developers to make content and knowledge accessible, usable and exploitable over time, over media and over language boundaries.
The aim is to help those whose applications need multilingual search solutions to identify the most appropriate technology, and to assist technology providers in developing competitive multilingual search solutions.
Coordinator of CLEF and Treble-CLEF