by Haridimos Kondylakis, Georgia Troullinou, Kostas Stefanidis and Dimitris Plexousakis (ICS-FORTH)
The recent explosion of the web of data and the associated Linked Open Data (LOD) initiative have led to an enormous number of widely available RDF datasets. These datasets often have extremely complex schemas, which are difficult to comprehend, limiting the exploitation potential of the information they contain. As a result, there is now, more than ever, an increasing need to develop methods and tools that facilitate the quick understanding and exploration of these data sources.
To this end, many approaches focus on generating ontology summaries. Ontology summarisation is defined as the process of distilling knowledge from an ontology in order to produce an abridged version. Although generating summaries is an active field of research, the generated summaries are in most cases static, which limits how far users can explore and exploit the information they contain. Methods and tools that support interactive understanding and exploration of data sources are therefore still needed.
RDFDigest [4] is a tool that was initially developed in the context of the eHealthMonitor FP7 project for summarising multifaceted linked health data. It exploits measures such as relevance and centrality to identify the most important nodes, and novel algorithms to link them together [3]. The tool was extended [1] within the MyHealthAvatar FP7 and iManageCancer H2020 projects with seven more measures for identifying importance, drawn from various approaches in the literature. In addition, new algorithms were introduced that try to minimise the number of non-important nodes that must be added to link the selected most important nodes in the final summary. Although end-users can select among various importance measures and linking algorithms, they might still find the presented information overwhelming and would ideally see less information, focusing only on a specific subset of the presented nodes.
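The two steps described above, selecting important nodes and then linking them, can be illustrated with a minimal sketch. It is not the RDFDigest implementation: it assumes a toy, hypothetical schema graph, uses plain degree centrality as a stand-in for the tool's importance measures, and links the selected nodes with breadth-first shortest paths rather than the algorithms of [1, 3].

```python
from collections import deque

# Hypothetical toy schema graph (undirected): class -> neighbouring classes.
SCHEMA = {
    "Patient": {"Person", "Diagnosis", "Treatment", "Visit"},
    "Person": {"Patient", "Address"},
    "Address": {"Person"},
    "Diagnosis": {"Patient", "Disease"},
    "Disease": {"Diagnosis", "Drug"},
    "Treatment": {"Patient"},
    "Visit": {"Patient"},
    "Drug": {"Disease", "Manufacturer", "Ingredient"},
    "Manufacturer": {"Drug"},
    "Ingredient": {"Drug"},
}


def top_k_by_degree(graph, k):
    """Rank nodes by degree (assumed importance measure), ties broken by name."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:k]


def shortest_path(graph, source, target):
    """Breadth-first search; returns the node list from source to target."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return []  # target not reachable from source


def summarise(graph, k):
    """Pick the k most important nodes, then add the nodes needed to link them."""
    important = top_k_by_degree(graph, k)
    nodes = set(important)
    for other in important[1:]:
        nodes.update(shortest_path(graph, important[0], other))
    return important, nodes


important, summary_nodes = summarise(SCHEMA, k=2)
print(important)
print(sorted(summary_nodes))
```

In this toy graph the two selected classes are not adjacent, so the summary also contains the intermediate classes on the path between them. These are the "additional non-important nodes" that the linking algorithms try to keep to a minimum.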
Figure 1: Zoom and extend operators.
To this end, the second version of the tool, RDFDigest+, will soon be released, enabling users to actively explore data sources further, starting from the summaries presented to them. Zoom-in and zoom-out operators allow users to explore the contents of a data source at a higher or lower granularity level, respectively. Furthermore, users might want more detailed information not on the whole schema graph but only on a selected subset of it. This can be achieved by selecting some nodes and requesting more detail on those, extending a specific part of the graph. The extend operation in essence provides additional nodes dependent on the selected ones, giving more detail on the selected part of the schema.
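As a rough illustration of the two kinds of operator, the sketch below treats a summary as the first few entries of a precomputed importance ranking. It makes several assumptions that do not come from the tool itself: the graph and ranking are hypothetical, zooming in is modelled as showing more ranked nodes and zooming out as showing fewer, and "nodes dependent on the selected ones" is approximated by their direct neighbours.

```python
# Hypothetical schema graph (undirected) and a precomputed importance ranking.
GRAPH = {
    "Patient": {"Person", "Diagnosis", "Visit"},
    "Person": {"Patient"},
    "Diagnosis": {"Patient", "Disease"},
    "Disease": {"Diagnosis", "Drug"},
    "Visit": {"Patient"},
    "Drug": {"Disease", "Manufacturer", "Ingredient"},
    "Manufacturer": {"Drug"},
    "Ingredient": {"Drug"},
}
RANKING = ["Patient", "Drug", "Diagnosis", "Disease",
           "Person", "Visit", "Manufacturer", "Ingredient"]


def zoom(ranking, current_size, step):
    """Resize the summary: step > 0 zooms in (more nodes), step < 0 zooms out."""
    new_size = max(1, min(len(ranking), current_size + step))
    return ranking[:new_size]


def extend(graph, summary, selected):
    """Add the direct neighbours of the selected nodes to the summary."""
    extra = set()
    for node in selected:
        extra |= graph[node]
    return set(summary) | extra


summary = zoom(RANKING, current_size=2, step=0)   # start with two nodes
zoomed_in = zoom(RANKING, current_size=2, step=1)
extended = extend(GRAPH, summary, selected={"Drug"})
print(summary, zoomed_in, sorted(extended))
```

The difference between the operators is their scope: `zoom` changes the level of detail of the whole summary, whereas `extend` adds detail only around the nodes the user has selected.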
From a different perspective, we also focus on how to explore changes in evolving ontologies [2]. Specifically, given the large number of ontologies that constantly evolve, there is a clear need to monitor and analyse the changes that occur in them. Traditional approaches for studying the evolution of data focus on providing humans with deltas that contain large amounts of information. In our approach, we propose a processing model that recommends evolution measures, taking into account particular challenges such as relatedness, transparency, diversity, fairness and anonymity. Our goal is to support humans with complementary measures that offer high-level overviews of the changes, helping them understand how data of interest evolves.
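To make the contrast between a raw delta and a high-level measure concrete, the sketch below computes one simple, assumed evolution measure: the number of added or removed triples per subject class between two versions. The triples are invented for illustration, and the sketch does not implement the recommendation model of [2], which chooses among measures rather than computing a single fixed one.

```python
from collections import Counter

# Two hypothetical versions of a small dataset, as (subject, predicate, object).
V1 = {
    ("Patient", "hasDiagnosis", "Diagnosis"),
    ("Patient", "hasVisit", "Visit"),
    ("Drug", "producedBy", "Manufacturer"),
}
V2 = {
    ("Patient", "hasDiagnosis", "Diagnosis"),
    ("Patient", "hasVisit", "Visit"),
    ("Drug", "producedBy", "Company"),
    ("Drug", "hasIngredient", "Ingredient"),
}


def delta(old, new):
    """Low-level delta: the triples added and the triples removed."""
    return new - old, old - new


def changes_per_subject(old, new):
    """High-level overview: how many changed triples involve each subject."""
    added, removed = delta(old, new)
    return Counter(subject for subject, _, _ in added | removed)


added, removed = delta(V1, V2)
overview = changes_per_subject(V1, V2)
print(len(added), len(removed))
print(overview.most_common(1))
```

Even in this tiny example the overview says where the change is concentrated, without the reader having to inspect every triple in the delta.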
Figure 2: The RDFDigest+ system.
References:
[1] A. Pappas, G. Troullinou, G. Roussakis, H. Kondylakis, D. Plexousakis, “Exploring Importance Measures for Summarizing RDF/S KBs”. ESWC (1) 2017: 387-403.
[2] K. Stefanidis, H. Kondylakis, D. Plexousakis, “On Recommending Evolution Measures: A Human-aware Approach”. Proc. of DESWeb @ ICDE 2017.
[3] G. Troullinou, H. Kondylakis, E. Daskalaki, D. Plexousakis, “Ontology understanding without tears: The summarization approach”. Semantic Web 8(6): 797-815 (2017).
[4] G. Troullinou, H. Kondylakis, E. Daskalaki, D. Plexousakis, “RDF Digest: Efficient Summarization of RDF/S KBs”. ESWC 2015: 119-134.
Link:
[L1]: http://rdfdigest.ics.forth.gr
Please contact:
Haridimos Kondylakis, FORTH-ICS, Greece