by Boris Motik
Scalability of ontology reasoning is a key factor in the practical adoption of ontology technologies. The KAON2 ontology reasoner has been designed to improve scalability in the case of reasoning over large data sets. It is based on a novel reasoning algorithm that builds upon extensive research in relational and deductive databases.
Ontologies - vocabularies of terms often shared by a community of users - are being applied in science and engineering disciplines as diverse as biology, geography, astronomy, agriculture and defence. Nowadays, ontologies are usually expressed in the W3C standard language called the Web Ontology Language (OWL). OWL ontologies consist of a schema part, called a TBox, which describes the concepts and relationships in the domain of discourse, and a data part, called an ABox, which describes the actual data in the application. An efficient reasoner is the cornerstone of most OWL-based applications. It implements the formal semantics of OWL and thus provides the application with query answering capabilities.
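The split between TBox and ABox, and the reason a reasoner is needed to answer queries, can be illustrated with a minimal sketch. It assumes a toy ontology whose TBox contains only subclass axioms; the concept and individual names (Student, Person, alice and so on) and the helper functions are hypothetical and chosen purely for illustration. OWL itself is far more expressive than this.

```python
# Hypothetical toy ontology; all names are illustrative assumptions.
# TBox: subclass axioms as (subconcept, superconcept) pairs.
tbox = {("Student", "Person"), ("Professor", "Person")}
# ABox: concept assertions as (concept, individual) pairs.
abox = {("Student", "alice"), ("Professor", "bob"), ("Course", "logic101")}


def superclasses(concept, tbox):
    """Return the concept together with all its transitive superconcepts."""
    seen = {concept}
    stack = [concept]
    while stack:
        current = stack.pop()
        for sub, sup in tbox:
            if sub == current and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def instances(concept, tbox, abox):
    """Individuals that belong to the concept, taking the TBox into account."""
    return {ind for asserted, ind in abox
            if concept in superclasses(asserted, tbox)}


# No individual is asserted to be a Person, yet the query has answers
# because the TBox states that students and professors are persons.
persons = instances("Person", tbox, abox)
```

A plain lookup in the ABox would return nothing for Person; the answers follow only once the schema is taken into account, which is the service a reasoner provides.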
While reasoning over OWL ontologies is a provably intractable computational problem, it has been observed that the ontologies encountered in practice rarely involve a combination of constructs that leads to intractability. By relying on sophisticated optimizations, reasoners have been developed that can handle ontologies with large TBoxes, yet these still do not provide adequate performance on ontologies containing large ABoxes. This has so far prevented the use of OWL in applications that depend on large data sets, such as metadata management and information integration.
Parallel to the development of reasoning techniques for OWL, significant effort has been invested into improving the scalability of relational and deductive databases. In particular, numerous optimizations of query answering in (disjunctive) datalog (a widely used deductive database language) are known and have proven themselves effective in practice. It is therefore natural to try to improve the scalability of ABox reasoning in OWL by building on this large body of existing work.
This idea has been realized in a new reasoner called KAON2. The architecture of the reasoner is shown in Figure 1. The central component of the system is the reasoning engine, which implements a completely new reasoning algorithm. A query Q over an ontology consisting of a TBox T and an ABox A is answered by first reformulating T as a set of clauses in first-order logic, and then transforming the result into a disjunctive datalog program DD(T). The latter step is the key part of KAON2: based on certain novel results in resolution-based theorem proving, it ensures that all answers to Q over T and A can equally be computed by evaluating Q over DD(T) and A. The main benefit of such a transformation is that, to evaluate Q with respect to DD(T) and A, the disjunctive datalog engine can use the optimization techniques known from deductive databases, such as (disjunctive) magic sets or join-order optimizations.
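The final stage of this pipeline, evaluating a query over a datalog program and an ABox, can be sketched as follows. This is a minimal illustration under strong assumptions and not KAON2's algorithm: the rules are written by hand rather than produced by the resolution-based reduction, they contain no disjunctions, and they are evaluated by naive bottom-up iteration without magic sets or join-order optimization. The predicates, individuals and function names are hypothetical.

```python
def is_var(term):
    """Variables are written with a leading question mark, e.g. '?x'."""
    return term.startswith("?")


def match(atom, fact, binding):
    """Extend binding so that atom equals fact; return None if impossible."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    new = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def evaluate(rules, facts):
    """Naive bottom-up evaluation: apply rules until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            bindings = [{}]
            for atom in body:  # join the body atoms from left to right
                bindings = [b2 for b in bindings for fact in derived
                            if (b2 := match(atom, fact, b)) is not None]
            for b in bindings:
                new_fact = (head[0],) + tuple(b.get(t, t) for t in head[1:])
                if new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


def answers(query, facts):
    """Tuples of values for the query's variables, in order of appearance."""
    variables = [t for t in query[1:] if is_var(t)]
    return {tuple(b[v] for v in variables)
            for fact in facts
            if (b := match(query, fact, {})) is not None}


# Hand-written stand-in for a program DD(T), as (head, body) pairs (assumed).
rules = [
    (("Person", "?x"), [("Student", "?x")]),        # every student is a person
    (("Person", "?x"), [("teaches", "?x", "?y")]),  # whoever teaches is a person
    (("Course", "?y"), [("teaches", "?x", "?y")]),  # whatever is taught is a course
]
# ABox A as a set of ground facts.
abox = {("Student", "alice"), ("teaches", "bob", "logic101")}

model = evaluate(rules, abox)
result = answers(("Person", "?x"), model)
```

Because the TBox is compiled into rules once and the ABox is treated as a set of facts, the work that grows with the data is ordinary rule evaluation, which is where database-style optimizations apply.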
This approach to query answering has proven practical and effective in cases where the TBox is rather simple but the ABox contains large amounts of data. On such ontologies, KAON2 has outperformed state-of-the-art reasoners by one or more orders of magnitude. KAON2 has thus become the platform of choice for numerous research projects, such as FIT (EU IST 27090), OntoGov (EU IST 507237), NeOn (EU IST-2005-027595), X-Media (EU FP6-26978), and KnowledgeWeb (EU FP6-507482). Furthermore, ontoprise GmbH, a vendor of ontology-based software infrastructure based in Karlsruhe, Germany, is integrating KAON2 into its product suite and is using the tool in a commercial setting.
KAON2 is written in Java and can be used free of charge for non-commercial purposes. The tool has emerged as a result of the author's PhD work at the University of Karlsruhe, Germany. Its development was continued at the University of Manchester, UK, and is currently taking place at the University of Oxford, UK.
Boris Motik has been awarded the 2007 Cor Baayen Award for a most promising young researcher in computer science and applied mathematics by ERCIM.
Computing Laboratory, Oxford University, UK