Open-Access Repositories and the Open Science Challenge

by Leonardo Candela, Paolo Manghi, and Donatella Castelli (ISTI-CNR)

The open-access movement is promoting free-of-restriction access to, and use of, research outcomes. It is a key aspect of the open-science movement, which is pushing for the research community to go ‘beyond papers’. This new paradigm calls for a new generation of repositories that are: (i) capable of smartly interfacing with the wealth of research infrastructures and services that scientists rely on, thus being able to intercept and publish research products, and (ii) able to provide researchers with social networking tools for discovery, notification, sharing, discussion, and assessment of research products.

The landscape of scientific research has changed dramatically in the last few years. The forces driving the change include both new technology (namely ICT infrastructures and services) and the open-science movement, which supports and encourages open-access-driven dissemination and exploitation of virtually every research product worth sharing: not only papers but also datasets, software, notebooks and every computational object produced in the course of research.

However, the evolution is still underway. ICT infrastructures are widespread among research communities and researchers, and the great majority of daily scientific activities relies on them, yet a gap remains between the ‘places’ where research is conducted and the ‘places’ where its dissemination and communication happen. This gap, which originates from the long tradition of paper-driven scientific communication that still characterises science, is one of the major barriers to overcome before open science becomes a reality. The traditional means of scientific communication are so ingrained that, when called upon to manage a new type of scientific product, i.e., ‘research data’, the scientific community responded by proposing existing approaches such as specific journals, i.e., data journals [2], and/or repositories, i.e., data repositories [3]. Such approaches do not fit well with the entire spectrum of research products envisaged. Effective interpretation, evaluation, and reuse of these products can only be ensured if they are published ‘within’ the environment (and context) from which they originate and ‘during’ the research activity.

Motivated by these observations, we envisioned a completely new kind of open access / science repository, SciRepo [1]. This is a sort of ‘overlay repository’ that is expected to sit on top of the research environment / infrastructure that researchers use, and to dynamically collect research artefacts (a) as soon as they are produced, (b) without requiring any effort to repurpose them for publication, and (c) fully equipped with their ‘context’, i.e., the wealth of information surrounding the artefact that is key to understanding it. SciRepo’s distinguishing features include: (a) hooks interfacing with ICT services to intercept the generation of products and to publish such products, i.e., to make them discoverable and accessible to other researchers; (b) provision of repository-like tools so that scientists can access and share research products generated during their research activities; (c) social-networking-based practices to modernise (scientific) communication both intra-community and inter-community, e.g., posting rather than deposition, ‘like’ and ‘open discussions’ for quality assessment, sharing rather than dissemination.

SciRepo repository-oriented facilities are largely based on the rich information graph characterising every published product. They include search and browse facilities that allow searching by product type and also permit navigation from research activities to their products and on to related products. Ingestion facilities are provided, allowing scientists to manually or semi-automatically upload ‘external’ products into the repository and associate them with a research activity, thus including them in the information graph. Ingestion allows scientists to complete the publication of a research activity with all products that are connected to it but were generated outside the boundaries of the community. The way scientists or groups of scientists can interact with products (access and reuse them) is governed by clear rights management functionalities. Rights are typically assigned when products are generated or ingested by scientists, but can vary over time.
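The search and navigation facilities described above can be illustrated with a minimal sketch. It is not SciRepo's actual data model: the `Product` and `InformationGraph` names, their methods, and the sample identifiers are all hypothetical, chosen only to show how a graph linking activities, products, and related products might support search by type and navigation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """A published research product (hypothetical minimal record)."""
    identifier: str
    kind: str  # e.g. "dataset", "software", "paper"


class InformationGraph:
    """Toy information graph linking research activities to products."""

    def __init__(self):
        self._products = {}                    # product id -> Product
        self._by_activity = defaultdict(set)   # activity -> product ids
        self._related = defaultdict(set)       # product id -> related product ids

    def publish(self, activity, product):
        """Register a product (generated or ingested) under a research activity."""
        self._products[product.identifier] = product
        self._by_activity[activity].add(product.identifier)

    def relate(self, first_id, second_id):
        """Record a symmetric 'related product' link."""
        self._related[first_id].add(second_id)
        self._related[second_id].add(first_id)

    def search_by_kind(self, kind):
        """Search by product type."""
        return sorted(p.identifier for p in self._products.values() if p.kind == kind)

    def products_of(self, activity):
        """Navigate from a research activity to its products."""
        return sorted(self._by_activity.get(activity, ()))

    def related_to(self, identifier):
        """Navigate from a product to its related products."""
        return sorted(self._related.get(identifier, ()))
```

In this sketch, ingesting an ‘external’ product is the same `publish` call as for a product generated inside the community; a fuller model would also attach rights and contextual metadata to each node.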

SciRepo collaboration-oriented facilities include typical social networking facilities such as the option to subscribe to events that are relevant to research activities and products, and to be promptly notified when they occur, e.g., the completion of a workflow execution or the generation of datasets that conform to particular criteria. Users can reply to posts and, most importantly, can express opinions on the quality of products, e.g., through ‘like’ actions or similar. SciRepo thus represents a step towards truly ‘open’ peer review. More sophisticated assessment / peer-review functionalities (single/double blind) can be supported, in order to provide more traditional notions of quality. Interestingly, the posts themselves represent a special type of product of the research activity and are searchable and browsable in the information graph.
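The subscription and notification mechanism described above can be sketched as a simple publish/subscribe pattern. This is only an illustration under stated assumptions: the `EventBus` class, the event type names, and the payload fields are hypothetical and do not reflect how SciRepo is implemented.

```python
from collections import defaultdict


class EventBus:
    """Toy event bus: scientists subscribe to event types, optionally with a filter."""

    def __init__(self):
        # event type -> list of (predicate, callback) pairs
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, callback, predicate=None):
        """Register a callback; the predicate expresses the subscriber's criteria."""
        accept = predicate if predicate is not None else (lambda payload: True)
        self._subscribers[event_type].append((accept, callback))

    def emit(self, event_type, payload):
        """Notify matching subscribers and return how many were notified."""
        delivered = 0
        for accept, callback in self._subscribers.get(event_type, []):
            if accept(payload):
                callback(payload)
                delivered += 1
        return delivered


# Usage: be notified only about generated datasets in CSV format.
bus = EventBus()
inbox = []
bus.subscribe("dataset_generated", inbox.append,
              predicate=lambda payload: payload["format"] == "csv")
bus.emit("dataset_generated", {"id": "ds-1", "format": "csv"})
bus.emit("dataset_generated", {"id": "ds-2", "format": "netcdf"})
```

The predicate plays the role of the ‘particular criteria’ mentioned above: only events whose payload satisfies it reach the subscriber.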

References:
[1] M. Assante et al.: “Science 2.0 Repositories: Time for a Change in Scholarly Communication”, D-Lib Magazine, 21 (1/2), 2015, doi:10.1045/january2015-assante
[2] L. Candela et al.: “Data Journals: A Survey”, Journal of the Association for Information Science and Technology, 66 (1): 1747–1762, 2015, doi:10.1002/asi.23358
[3] M. Assante et al.: “Are Scientific Data Repositories Coping with Research Data Publishing?”, Data Science Journal, 15, 2016, doi:10.5334/dsj-2016-006/

Please contact:
Leonardo Candela, ISTI-CNR, Italy