by Costantino Thanos and Andreas Rauber

Research data are essential to all scientific endeavours. Openness in the sharing of research results is one of the norms of modern science. The assumption behind this openness is that scientific progress requires results to be shared within the scientific community as early as possible in the discovery process.

by Christine L. Borgman

Data sharing has become policy enforced by governments, funding agencies, journals, and other stakeholders. Arguments in favor include leveraging investments in research, reducing the need to collect new data, addressing new research questions by reusing or combining extant data, and reproducing research, which would lead to greater accountability, transparency, and less fraud. Arguments against data sharing rarely are expressed in public fora, so popular is the idea. Much of the scholarship on data practices attempts to understand the socio-technical barriers to sharing, with goals to design infrastructures, policies, and cultural interventions that will overcome these barriers.

by Tomasz Miksa and Andreas Rauber

Sharing and reuse of data is just an intermediate step on the way to reproducible computational science. The next step, sharing and reuse of processes that transform data, is enabled by process management plans, which benefit multiple stakeholders at all stages of research.

by Marie Sandberg, Rob Baxter, Damien Lecarpentier and Paweł Kamocki

Facilitating open access to research data is a principle endorsed by an increasing number of countries and international organizations, and one of the priorities flagged in the European Commission’s Horizon 2020 funding framework [1][2]. But what do researchers themselves think about it? How do they perceive the increasing demand for open access and what are they doing about it? What problems do they face, and what sort of help are they looking for?

by Massimiliano Assante, Leonardo Candela, Paolo Manghi, Pasquale Pagano, and Donatella Castelli

The purpose of data publishing is to release research data for others to use. However, its implementation remains an open issue. ‘Science 2.0 Repositories’ (SciRepos) address the publishing requirements arising in Science 2.0 by blurring the distinction between research life-cycle and research publishing. SciRepos interface with the ICT services of research infrastructures to intercept and publish research products while providing researchers with social networking tools for discovery, notification, sharing, discussion, and assessment of research products.

by Anna Basoni, Stefano Menegon and Alessandro Sarretta

A thorough understanding of marine and ocean phenomena calls for synergic multidisciplinary data provision. Unfortunately, much scientific data is still kept in drawers, and in many cases scientists and stakeholders are unaware of its existence. At the same time, researchers lament the time-consuming nature of data collection and delivery. To overcome barriers to data access, the RITMARE project issued a data policy document, an agreement among participants on how to share the data and products either generated by the project activities or derived from previous activities, with the aim of recognizing the effort involved.

by Keith G. Jeffery and Rebecca Koskela

RDA is all about helping researchers to use data (including scholarly publications and grey literature used as data). This encompasses data collection, data validation, data management (including preservation/curation), data analysis, data simulation/modelling, data mining, data visualisation and interoperation of data. Metadata are the key to all of these activities because they present to persons, organisations, computer systems and research equipment a representation of the dataset so that the dataset can be acted upon.

by Stefano Nativi, Keith G. Jeffery and Rebecca Koskela

RDA is about interoperation for dataset re-use. Datasets exist over many nodes. Those described by metadata can be discovered; those cited by publications or datasets have navigational information. Consequently, two major forms of access request exist: (1) download of complete datasets based on citation or (query over) metadata, and (2) retrieval of relevant parts of dataset instances from a query across datasets.

by Catherine Jones, Brian Matthews and Antony Wilson

Data publication and sharing are becoming accepted parts of the data ecosystem to support research, and this is becoming recognised in the field of ‘facilities science’. We define facilities science as that undertaken at large-scale scientific facilities, in particular neutron and synchrotron x-ray sources, although similar characteristics can also apply to large telescopes, particle physics institutes, environmental monitoring centres and satellite observation platforms. In facilities science, a centrally managed set of specialized and high value scientific instruments is made accessible to users to run experiments which require the particular characteristics of those instruments.

by Mirko Manea and Marinella Petrocchi

Sharing data among groups of organizations and/or individuals is essential in a modern web-based society, being at the very core of scientific and business transactions. Data sharing, however, poses several problems including trust, privacy, data misuse and/or abuse, and uncontrolled propagation of data. We describe an approach, based on scientific Data Sharing Agreements (DSA), to preserving privacy while sharing data.

by Leonardo Candela and Pasquale Pagano

Data sharing has been an emerging topic since the 1980s. Science evolution – e.g. data-intensive science, open science, science 2.0 – is revamping this discussion and calling for data infrastructures capable of properly managing data sharing and promoting extensive reuse. ‘gCube’, a software system that promotes the development of data infrastructures, boasts the distinguishing feature of providing its users with Virtual Research Environments where data sharing and reuse actually happen.

by Thilo Stadelmann, Mark Cieliebak and Kurt Stockinger

In recent years, large amounts of data have been made publicly available: literally thousands of open data sources exist, covering genome data, temperature measurements, stock market prices, population and income statistics, etc. However, accessing and combining data from different data sources is both non-trivial and very time-consuming. These tasks typically take up to 80% of the time of data scientists. Automatic integration and curation of open data can facilitate this process.

by Juan Bicarregui and Brian Matthews

Today’s scientific research is conducted not just by single experiments but rather by sequences of related experiments or projects linked by a common theme that lead to a greater understanding of the structure, properties and behaviour of the physical world. This is particularly true of research carried out on large-scale facilities such as neutron and photon sources where there is a growing need for a comprehensive data infrastructure across these facilities to enhance the productivity of their science.

by Paulo Carvalho, Patrik Hitzelberger and Gilles Venturini

Open Data (OD) is one of the most active movements contributing to the spread of information over the web. However, there is no common standard for publishing datasets. Data is made available by different kinds of entities (private and public), in various formats and according to different cost models. Even if the information is accessible, it does not mean it can be reused. Before being able to use it, an aspiring user must have knowledge about its structure, the location of meaningful fields and other variables. Information visualization can help the user to understand the structure of OD datasets.

by Robert Viseur and Nicolas Devos

The term ‘open data’ refers to “information that has been made technically and legally available for reuse”. Open data is currently a hot topic for a number of reasons, namely: the move within the scientific community towards reproducible research and the sharing of experimental data; the enthusiasm, especially within the scientific community, for the semantic web and linked data; the publication of datasets in the public sector (e.g., geographical information); and the emergence of online communities (e.g., OpenStreetMap). The open data movement engages the public sector, as well as business and academia. The motivation for opening data, however, varies among interest groups.
