by Enol Fernández and Ilaria Fava (EGI Foundation) 

EOSC Data Commons is a new European initiative to build the next generation of federated research data infrastructure. Researchers often know the data they need exists but cannot find it or access it easily. EOSC Data Commons tackles this fragmentation head-on by making cross-border discovery and reuse far more practical.

EOSC Data Commons [L1] is a recently launched project that aims to shape the next generation of research data infrastructure in Europe. The project contributes to the European Open Science Cloud (EOSC) [1] by fostering seamless access to high-quality, interoperable research data and services. In practice, this means helping researchers in different fields actually work on the same data without spending weeks resolving format, access, or compatibility issues.

Coordinated by the EGI Foundation, EOSC Data Commons brings together a consortium of European infrastructures, renowned data repositories, virtual research environment providers, and data analysis workflow providers [L2]. It links national and thematic infrastructures across Europe, creating a federated system that works across borders while respecting how individual providers operate.

Why EOSC Data Commons is needed
Research data volumes have grown dramatically over the last decade, driven by advances in digital technologies, large-scale instruments, simulations, and data-intensive methods such as artificial intelligence. While this data explosion holds enormous potential, much of the resulting data remains siloed, heterogeneous, and difficult to discover, access, or reuse. As a consequence, valuable scientific insights often remain untapped, and reproducibility and cross-disciplinary collaboration are hindered.

The project addresses these challenges by creating a trusted, distributed environment that brings together data, tools, and compute resources across disciplines and national boundaries. By reducing fragmentation and improving interoperability, the project aims to make research data more FAIR [2] and support the full research data lifecycle, from deposition through long-term preservation and reuse.

What EOSC Data Commons is building
At the core of EOSC Data Commons is the development of two complementary services.

  1. EOSC Matchmaker [L3] enables researchers to discover datasets, tools, and services across multiple scientific domains. This discovery layer is combined with a catalogue of analytics tools and execution services, allowing tools to be deployed close to the data.
  2. EOSC Data Player [L4] complements this by addressing interoperability and integration of execution platforms and data providers. It provides harmonised APIs, shared metadata specifications, and common mechanisms for authentication, authorisation, and provenance tracking. In practical terms, this means a researcher can combine data and tools from different providers without manually reconciling metadata formats or authentication systems, while still keeping full traceability.
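
As a rough illustration of what a shared metadata specification saves the researcher, the sketch below maps records from two providers that use different field names onto one common shape. Everything named here is hypothetical: the providers `repo_a` and `repo_b`, their field names, the sample identifiers, and the `CommonRecord` type are invented for the example and are not taken from the EOSC Data Player specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical shared record shape (not the EOSC Data Player schema)."""
    identifier: str
    title: str
    licence: str
    provider: str


# Hypothetical mappings: common field name -> provider-specific field name.
FIELD_MAPS = {
    "repo_a": {"identifier": "doi", "title": "name", "licence": "license"},
    "repo_b": {"identifier": "pid", "title": "dc_title", "licence": "rights"},
}


def harmonise(provider: str, raw: dict) -> CommonRecord:
    """Translate one provider-specific record into the common shape."""
    mapping = FIELD_MAPS[provider]
    missing = [src for src in mapping.values() if src not in raw]
    if missing:
        raise ValueError(f"{provider} record lacks fields: {missing}")
    fields = {common: raw[src] for common, src in mapping.items()}
    return CommonRecord(provider=provider, **fields)


# Made-up sample records, one per provider.
records = [
    harmonise("repo_a", {"doi": "10.1234/a", "name": "Ocean temps",
                         "license": "CC-BY-4.0"}),
    harmonise("repo_b", {"pid": "hdl:5678/b", "dc_title": "Survey wave 3",
                         "rights": "CC0-1.0"}),
]

for record in records:
    print(record.identifier, record.title, record.licence, sep=" | ")
```

Once both records share one shape, downstream tools can filter or combine them without provider-specific code, which is the kind of manual reconciliation the harmonised specifications are meant to remove.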

The idea is to let researchers move from discovery to analysis and preservation without switching between disconnected systems. By bringing computation to the data, EOSC Data Commons services reduce data movement, improve efficiency, and enable scalable analysis across distributed infrastructures.
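
The effect of bringing computation to the data can be shown with a toy placement calculation. This is only a sketch under stated assumptions: the site names, dataset names, and sizes are invented, and the rule used here (run the analysis wherever the fewest bytes have to be transferred) is a simplification for illustration, not a description of how EOSC Data Commons schedules work.

```python
GB = 10**9

# Hypothetical datasets: name -> (hosting site, size in bytes).
DATASETS = {
    "genomes": ("site_nl", 500 * GB),
    "phenotypes": ("site_de", 2 * GB),
}

# Hypothetical sites that could run the analysis.
SITES = ["site_nl", "site_de", "site_fr"]


def bytes_moved(site: str, datasets: dict) -> int:
    """Bytes that must be transferred if the analysis runs at `site`."""
    return sum(size for location, size in datasets.values() if location != site)


def best_site(sites: list, datasets: dict) -> str:
    """Pick the execution site that minimises total data movement."""
    return min(sites, key=lambda site: bytes_moved(site, datasets))


for site in SITES:
    print(f"{site}: {bytes_moved(site, DATASETS) / GB:.0f} GB to transfer")
print("chosen site:", best_site(SITES, DATASETS))
```

With these invented numbers, running the analysis next to the large dataset means moving only the small one, whereas a site that hosts neither dataset would need both copied in.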

Technical Approach
EOSC Data Commons combines several technical approaches to realise its vision. These include distributed architectures that respect the autonomy of data providers while enabling cross-infrastructure integration; semantic technologies and knowledge graphs for rich metadata representation and discovery; and standardised interfaces to support interoperability. The project adopts community standards such as RO-Crate [3] for packaging research artefacts and their metadata in a machine-actionable, interoperable manner.
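
To give a sense of what an RO-Crate package looks like, here is a minimal sketch that builds an `ro-crate-metadata.json` file following the RO-Crate 1.1 layout: a metadata descriptor, a root dataset, and one entry per file. The dataset name, description, date, licence, and file path are invented for the example, and the helper `build_metadata` is hypothetical; the project's own crates may carry additional or different properties.

```python
import json
import tempfile
from pathlib import Path

RO_CRATE_CONTEXT = "https://w3id.org/ro/crate/1.1/context"
RO_CRATE_PROFILE = "https://w3id.org/ro/crate/1.1"


def build_metadata(name, description, date_published, license_url, files):
    """Build RO-Crate 1.1 metadata as a JSON-LD dictionary.

    `files` is a list of (relative_path, label, media_type) tuples.
    """
    file_entities = [
        {"@id": path, "@type": "File", "name": label,
         "encodingFormat": media_type}
        for path, label, media_type in files
    ]
    # The descriptor identifies this file as RO-Crate metadata about "./".
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": RO_CRATE_PROFILE},
        "about": {"@id": "./"},
    }
    # The root dataset describes the crate as a whole and lists its parts.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "license": {"@id": license_url},
        "hasPart": [{"@id": entity["@id"]} for entity in file_entities],
    }
    return {"@context": RO_CRATE_CONTEXT,
            "@graph": [descriptor, root, *file_entities]}


# Invented example values, for illustration only.
metadata = build_metadata(
    name="Example sensor readings",
    description="Hourly readings from a hypothetical sensor network.",
    date_published="2025-04-01",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    files=[("data/readings.csv", "Sensor readings", "text/csv")],
)

# Write the metadata file into a temporary crate directory and read it back.
with tempfile.TemporaryDirectory() as crate_dir:
    metadata_path = Path(crate_dir) / "ro-crate-metadata.json"
    metadata_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    reloaded = json.loads(metadata_path.read_text(encoding="utf-8"))

print(json.dumps(reloaded["@graph"][1], indent=2))
```

Because the metadata is plain JSON-LD stored alongside the data files, any tool that reads JSON can list what the package contains, and tools that understand linked data can go further and connect the entries to external vocabularies.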

The project also explores machine-assisted methods for search and analysis, particularly where datasets are too large or complex for manual exploration. EOSC Data Commons aims to help researchers extract insights more efficiently from large, complex datasets while improving reproducibility and transparency of research results. Figure 1 illustrates the architecture of EOSC Data Commons.

Figure 1: The EOSC Data Commons architecture.

Orientation and Validation Through Use Cases
EOSC Data Commons is strongly user-driven. The project is validated through real-world use cases [L5] spanning a wide range of scientific domains, including life sciences, social sciences, environmental research, physics, and beyond. These use cases ensure that the developed services address concrete research needs and demonstrate cross-disciplinary relevance. By working closely with research communities, data repository managers, and service providers, the project aligns technical development with practical requirements and established research workflows.

Who Can Benefit and Participate
EOSC Data Commons is designed to benefit and engage a broad range of stakeholders. These include researchers seeking interoperable access to data and compute resources; data repository managers and infrastructure providers interested in joining a federated European ecosystem and extending their reach to new communities; tool developers and analytics service providers looking for integration opportunities; and research organisations and policy-makers committed to open science, data reuse, and FAIR principles.

The project is actively offering opportunities for participation and collaboration, strengthening the EOSC ecosystem [L6].

Timeline and Future Activities
Started in April 2025, EOSC Data Commons is an ongoing effort whose results will evolve over the course of the project. Future activities include further integration of data repositories and services, expansion of supported use cases, and continued refinement of the EOSC Matchmaker and EOSC Data Player services based on user feedback.

In the longer term, EOSC Data Commons aims to provide a sustainable foundation for Europe’s open research data landscape, supporting innovation, collaboration, and scientific excellence well beyond the project’s lifetime.

Links:
[L1] https://www.eosc-data-commons.eu/  
[L2] https://www.eosc-data-commons.eu/about  
[L3] https://www.eosc-data-commons.eu/service/eosc-matchmaker/  
[L4] https://www.eosc-data-commons.eu/service/eosc-data-player/  
[L5] https://www.eosc-data-commons.eu/use-cases 
[L6] https://www.eosc-data-commons.eu/open-call 
 
References:
[1] Horizon Europe Co-programmed Partnership for the European Open Science Cloud (EOSC), “Strategic Research and Innovation Agenda (SRIA) of the European Open Science Cloud (EOSC),” ver. 1.3, Zenodo, 2024. doi: 10.5281/zenodo.17582648.
[2] M. Wilkinson, et al., “The FAIR Guiding Principles for scientific data management and stewardship,” Sci. Data, vol. 3, p. 160018, 2016, doi: 10.1038/sdata.2016.18.
[3] S. Soiland-Reyes, et al., “Packaging research artefacts with RO-Crate,” Data Science, vol. 5, no. 2, pp. 97–138, 2022, doi: 10.3233/DS-210053.

Please contact: 
Enol Fernández 
EGI Foundation, The Netherlands