by Irini Fundulaki and Sören Auer

The Linked Data paradigm has emerged as a powerful enabler for publishing, enriching and sharing data, information and knowledge on the Web. It offers a set of best practices that promote the publication of data on the Web using Semantic Web technologies such as URIs and RDF, make exchanging structured data as easy as sharing documents, allow the creation of typed links between Web resources and offer a single, standardized access mechanism. In particular, the Linked Data shift is based on (1) using Uniform Resource Identifiers (URIs) to identify all kinds of “things”, (2) making these URIs accessible via the HTTP protocol and (3) providing a description of these things in the Resource Description Framework (RDF) along with (4) URI links to related information (see Tim Berners-Lee’s Linked Data design principles

by Kostis Kyzirakos, Stefan Manegold, Charalampos Nikolaou and Manolis Koubarakis

TELEIOS is a recent European project that addresses the need for scalable access to petabytes of Earth Observation (EO) data and the identification of hidden knowledge that can be used in applications. To achieve this, TELEIOS builds on scientific databases, linked geospatial data and ontologies. TELEIOS was the first project internationally to introduce the Linked Data paradigm to the EO domain, and it developed prototype services such as the real-time fire monitoring service that has been used for the last two years by decision makers and emergency response managers in Greece.

by Spiros Athanasiou, Daniel Hladky, Giorgos Giannopoulos, Alejandra Garcia Rojas and Jens Lehmann

The GeoKnow project aims to make geospatial data accessible on the Web of Data, transforming the Web into a place where geospatial data can be published, queried, reasoned over and interlinked according to Linked Data principles.

by Mathieu d'Aquin and Stefan Dietze

Education is now entering a revolution in the form of open education, where Linked Open Data has the potential to play a vital role. The Web of Linked Educational Data is growing with information about courses and resources, and emerging as a collective information backbone for open education.

Education has often been a keen adopter of new information and communication technologies. This is not surprising given that education is all about informing and communicating, and it is currently entering a revolution in the form of open education. This requires the use of state-of-the-art technologies for sharing, publishing and connecting information globally, free from technological barriers and cultural frontiers: namely, Linked Data [1].

by Alexander Mikroyannidis, John Domingue and Elena Simperl

There is currently a revolution going on in education generally, but nowhere more so than in the ICT field, owing to the availability of high quality online learning resources and MOOCs (Massive Open Online Courses). The EUCLID project is at the forefront of this initiative by developing a comprehensive educational curriculum, supported by multimodal learning materials and highly visible eLearning distribution channels, tailored to the real needs of data practitioners.

MOOCs offer large numbers of students the opportunity to study high quality courses with prestigious universities. These initiatives have led to widespread publicity as well as strategic dialogue in the higher education sector. The consensus within higher education is that after the Internet-induced revolutions in communication, business, entertainment and the media, it is now the turn of universities. Exactly where this revolution will lead is not yet known but some radical predictions have been made, including the end of the need for university campuses (

by Cristiano Fugazza, Alessandro Oggioni and Paola Carrara

The RITMARE (la Ricerca ITaliana per il MARE – Italian Research for the sea) Flagship Project is one of the National Research Programmes funded by the Italian Ministry of University and Research. Its goal is the interdisciplinary integration of national marine research. In order to design a flexible Spatial Data Infrastructure (SDI) that adapts to the audience's specificities, the necessary context information is drawn from existing RDF-based schemata and sources. This enables semantics-aware profiling of end-users and resources, thus allowing their provision as Linked Open Data.

by Florian Stegmaier, Kai Schlegel and Michael Granitzer

Although Linked Open Data has increased enormously in volume over recent years, there is still no single point of access for querying the more than 200 SPARQL repositories. The Balloon project aims to create a Meta Web of Data focusing on structural information by crawling co-reference relationships in all registered and reachable Linked Data SPARQL endpoints. The current Linked Open Data cloud, although huge in size, offers poor service quality and is inadequately maintained, thus complicating access via SPARQL endpoints. This issue needs to be resolved before the Linked Open Data cloud can achieve its full potential.

by Irene Petrou and George Papastefanatos

Linked Open Data (LOD) technology is an emerging way of making structured data available on the Web. This project aims to develop a generic methodology for publishing statistical datasets, mainly stored in tabular formats (e.g., CSV and Excel files) and relational databases, as LOD. We build statistical vocabularies and LOD storage technologies on top of existing publishing tools to ease the process of publishing these data. Our efforts focus on census data collected during Greece’s 2011 Census Survey and provided by the Hellenic Statistical Authority. We develop a platform through which the Greek Census Data are converted, interlinked and published.

by Pierre-Yves Vandenbussche and Bernard Vatant

The “Web of Data” has recently undergone rapid growth with the publication of large datasets, often as Linked Data, by public institutions around the world. One of the major barriers to the deployment of Linked Data is the difficulty data publishers have in determining which vocabularies to use to describe the semantics of data. The Linked Open Vocabularies (LOV) initiative stands as an innovative observatory for the re-usable linked vocabularies ecosystem. The initiative goes beyond collecting and highlighting vocabulary metadata. It now plays a major social role in promoting good practice and improving overall ecosystem quality.

by Maarten Marx

We investigate the coverage of Wikipedia for historical public figures. Unsurprisingly, the probability of a figure having a Wikipedia entry declines with time since the person was active. Nevertheless, two thirds of the Dutch members of parliament who have been active in the last 140 years have a Wikipedia page. The need to link historical figures to existing knowledge bases like Wikipedia/DBpedia comes from current large-scale efforts to digitize primary data sources, including proceedings of parliament and historical newspapers. Linking entries to knowledge bases can provide values of key background variables, such as gender, age, and (party) affiliation.

by Renzo Angles, Minh-Duc Pham and Peter Boncz

With inherent support for storing and analysing highly interconnected data, graph and RDF databases appear as natural solutions for developing Linked Open Data applications. However, current benchmarks for these database technologies do not fully attain the desirable characteristics of industrial-strength benchmarks [1] (e.g. relevance and verifiability) and typically do not model scenarios characterized by complex queries over skewed and highly correlated data [2]. The Linked Data Benchmark Council (LDBC) is an EU FP7 ICT project that brings together a community of academic researchers and industry partners, whose main objective is the development of industrial-strength benchmarks for graph and RDF databases.

by Nicola Ferro and Gianmaria Silvello

Experimental evaluation of search engines produces scientific data that are highly valuable from both a research and a financial point of view. These data need to be interpreted and exploited over a long time-frame, and a crucial goal is to ensure their curation and enrichment via inter-linking with relevant resources in order to harness their full potential. To this end, we exploit the LOD paradigm to increase the discoverability, understandability and re-usability of experimental data.

by Alexandra Roatiș

The WaRG framework brings flexibility and semantics to data warehousing. The development of Semantic Web data represented within W3C’s Resource Description Framework [1] (RDF), and the associated standardization of the SPARQL query language, now at v1.1, has led to the emergence of many systems storing, querying, and updating RDF. However, as more and more RDF datasets (graphs) are made available, in particular Linked Open Data, application requirements also evolve.

by Wendy Hall, Thanassis Tiropanis, Ramine Tinati, Xin Wang, Markus Luczak-Rösch and Elena Simperl

Linked data technologies provide advantages in terms of interoperability and integration, which, in certain cases, come at the cost of performance. The Web Observatory, a global Web Science research project, is providing a benchmark infrastructure to understand and address the challenges of analytics on distributed Linked Data infrastructures.

by Pierre-Yves Vandenbussche, Aidan Hogan, Jürgen Umbrich and Carlos Buil Aranda

Hundreds of datasets on the Web can now be queried through public, freely-available SPARQL services. These datasets contain billions of facts spanning a plethora of diverse topics hosted by a variety of publishers, including some household names, such as the UK and US governments, the BBC and the Nobel Prize Committee. A Web client using SPARQL could, for example, query about the winners of Nobel Prizes from Iceland, or about national electric power consumption per capita in Taiwan, or about homologs found in eukaryotic genomes, or about Pokémon that are particularly susceptible to water attacks. But are these novel SPARQL services ready for use in mainstream Web applications? We investigate further.

by András Micsik, Sándor Turbucz and Zoltán Tóth

There is a range of problems associated with current Linked Data visualization tools, including lack of genericity and reliance on non-standard dataset endpoint features. These problems hinder the emergence of generic Linked Data browsers and can thus complicate the process of accessing Linked Data. With LODmilla we aim to overcome common problems of Linked Open Data (LOD) browsing and to establish an extensible base platform for further evolution of Linked Data browsers.

by George Papastefanatos and Yannis Stavrakas

The recent development of Linked Open Data technologies has enabled large-scale exploitation of previously isolated public, scientific or enterprise data silos. Given the wide availability and value of these knowledge bases, a fundamental issue arises regarding their long-term accessibility: how do we record their evolution and how do we preserve them for future use? Until now, traditional preservation techniques have kept information in fixed data sets, “pickled” and “locked away” for future use. Given the complexity, the interlinking and the dynamic nature of current data, especially Linked Open Data, radically new methods are needed.

by Christian Dirschl, Katja Eck and Jens Lehmann

The Linked Data Stack is an integrated distribution of aligned tools that support the whole lifecycle of Linked Data from extraction, authoring/creation via enrichment, interlinking and fusing through to maintenance. A global publishing company represents an ideal recent real-world usage scenario, illustrating the Linked Data Stack and the underlying lifecycle of Linked Data (including data-flows and usage scenarios).

by Minh-Duc Pham and Peter Boncz

The Resource Description Framework (RDF) has been used as the main data model for the Semantic Web and Linked Open Data, providing great flexibility for users to represent and evolve data without the need for a prior schema. This flexibility, however, poses challenges in implementing efficient RDF stores. It i) leads to query plans with many self-joins in triple tables, ii) blocks the use of advanced relational physical storage optimizations such as clustered indexes and data partitioning, and iii) sometimes makes it problematic for users to comprehend the data and formulate queries, because no schema is available [1]. In the Database Architecture group at CWI, Amsterdam, we tackle these RDF data management problems by automatically recovering the structure present in RDF data, leveraging this structure both internally inside the database systems (in storage, optimization, and execution), and externally as an emergent schema towards the users who pose queries.
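To illustrate the self-join problem mentioned above, here is a minimal sketch that stores a few triples in a single SQLite table and retrieves three properties of the same subject. It is not the CWI system: the table layout, the `ex:` names, the sample data and the function `people_in_city` are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical toy data: every fact is one (subject, predicate, object) row.
TRIPLES = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:age", "34"),
    ("ex:alice", "ex:city", "Amsterdam"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:bob", "ex:age", "41"),
    ("ex:bob", "ex:city", "Utrecht"),
]


def people_in_city(city):
    """Return (name, age) pairs for subjects whose ex:city equals `city`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", TRIPLES)
    # Each property lives in its own row, so reading three properties of one
    # subject needs three aliases of the same table joined on the subject:
    # two self-joins for what would be a single-row lookup in a wide table.
    rows = conn.execute(
        """
        SELECT n.o, a.o
        FROM triples AS n
        JOIN triples AS a ON a.s = n.s
        JOIN triples AS c ON c.s = n.s
        WHERE n.p = 'ex:name' AND a.p = 'ex:age'
          AND c.p = 'ex:city' AND c.o = ?
        ORDER BY n.o
        """,
        (city,),
    ).fetchall()
    conn.close()
    return rows


print(people_in_city("Amsterdam"))
```

With a recovered schema, the same subjects could instead be stored as rows of a table with name, age and city columns, and the query would need no joins at all.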

by Miguel A. Martínez-Prieto, Carlos E. Cuesta, Javier D. Fernández and Mario Arias

Linked Open Data has increased the availability of semantic data, including huge flows of real-time information from many sources. Processing systems must be able to cope with such incoming data, while simultaneously providing efficient access to a live data store including both this growing information and pre-existing data. The SOLID architecture has been designed to handle such workflows, managing big semantic data in real-time.
