by Alessia Bardi and Luca Frosini (ISTI-CNR)

Research infrastructures (RIs) are “facilities, resources and services used by the science community to conduct research and foster innovation” [L1]. Researchers’ needs for digital services led to the realisation of e-Infrastructures, i.e., RIs offering digital technologies for data management, computing and networking. Relevant examples are high-speed connectivity infrastructures (e.g., GÉANT), grid computing infrastructures (e.g., European Grid Infrastructure EGI), scholarly communication infrastructures (e.g., OpenAIRE), and data e-infrastructures (e.g., D4Science).

Digital humanities infrastructures (DHIs) are e-infrastructures that give humanities researchers a digital environment where they can find and use ICT tools and research data for their research activities. A growing number of DHIs have been realised, most of them targeting a specific sector of the humanities, such as ARIADNE [L2] for archaeology, EHRI [L3] for studies on the Holocaust, CENDARI [L4] for history, CLARIN [L5] for linguistics, and DARIAH [L6] for arts and humanities. Because of their discipline-specific focus, these DHIs offer specialised services and tools to researchers, who are now demanding support for interdisciplinary research, common solutions for data management, and access to resources that are traditionally relevant to other sectors (e.g., text-mining algorithms traditionally used by linguists can also be useful to historians and social scientists).

One of the main goals of the PARTHENOS project (Pooling Activities, Resources and Tools for Heritage E-research Networking, Optimization and Synergies – EC-H2020-RIA grant agreement 654119) is to bridge existing DHIs by forming a federation where researchers of different sectors of the humanities can collaborate and share data, services and tools in an integrated environment.

PARTHENOS will produce a complete technical framework for the federation, enabling transparent access to resources managed by different DHIs and enabling the creation and operation of virtual research environments (VREs) [1] where researchers with different backgrounds can collaborate on specific research topics.

The technical framework supports the realisation of the federation by offering tools for:

  • The creation of a homogeneous information space where all resources (data, services and tools) of the different DHIs are described according to a common data model.
  • The discovery of available resources.
  • The use of available resources (for download or processing).
  • The creation of VREs where users can find resources relevant for a research topic, run services, and share the computational results.
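As a rough illustration of the first two functions listed above, the following Python sketch maps records from two providers onto one shared structure and then searches across them. It is a toy example and not PARTHENOS code: the `Resource` class, the provider names, the field names and the sample records are all hypothetical, and a flat structure like this is far simpler than a real common data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical common description shared by all providers."""
    provider: str
    kind: str   # "dataset", "service" or "tool"
    title: str


# Hypothetical per-provider field names mapped onto the common model.
FIELD_MAPS = {
    "provider_a": {"kind": "type", "title": "name"},
    "provider_b": {"kind": "category", "title": "label"},
}


def harmonise(provider: str, record: dict) -> Resource:
    """Translate one provider-specific record into the common model."""
    fields = FIELD_MAPS[provider]
    return Resource(
        provider=provider,
        kind=record[fields["kind"]].lower(),
        title=record[fields["title"]].strip(),
    )


def discover(resources: list[Resource], keyword: str) -> list[Resource]:
    """Return resources whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [r for r in resources if needle in r.title.lower()]


# Invented sample records, each using its provider's own field names.
registry = [
    harmonise("provider_a", {"type": "Dataset", "name": " Medieval charters "}),
    harmonise("provider_b", {"category": "SERVICE", "label": "Text mining for medieval sources"}),
    harmonise("provider_b", {"category": "tool", "label": "Map viewer"}),
]

hits = discover(registry, "medieval")
```

Once every record has the same shape, a single search function can serve all providers, which is the practical benefit of describing resources with a common model.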

Figure 1: Technical framework for DHIs federation.

The technical framework (see Figure 1) includes two main components: the PARTHENOS Content Cloud Framework (CCF) and the Joint Resource Registry (JRR).

The CCF supports the aggregation of metadata about resources from the DHIs of the federation. The PARTHENOS aggregator is realised with the D-NET software toolkit [2], an enabling framework for the realisation of Aggregative Data Infrastructures (ADIs) developed and maintained by CNR-ISTI. D-NET provides functionality for the automatic collection, harmonisation, curation and delivery of metadata coming from a dynamic set of heterogeneous data providers. In the context of the PARTHENOS project, D-NET has been configured to collect metadata made available by existing DHIs operated by PARTHENOS partners (namely: ARIADNE, CENDARI, CLARIN, CulturaItalia, DARIAH DE, DARIAH GR/DYAS, DARIAH IT, EHRI, Huma-Num, ILC) and to harmonise them according to an extension of the CIDOC-CRM model [L7] [L8] by applying X3ML [L9] mappings. The mapping language, editor and execution engine are realised and maintained by the Greek partner FORTH. Aggregated content is then published via different endpoints, supporting a set of (de facto) standard protocols for metadata search (Solr API, SPARQL) and exchange (OAI-PMH).
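To make the exchange protocol more concrete, the following Python sketch shows how a client might page through records exposed over OAI-PMH. It is a minimal sketch under stated assumptions: the base URL `https://example.org/oai` and the sample response are invented for illustration, since the article does not give endpoint addresses, and `oai_dc` is used only because it is the metadata prefix that every OAI-PMH repository is required to support. No network call is made; the response is parsed from an embedded string.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In the protocol, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty or missing token means this was the last page.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Invented response fragment standing in for a real server reply.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:1</identifier></header></record>
    <record><header><identifier>oai:example:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://example.org/oai")  # hypothetical endpoint
ids, token = parse_page(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
```

A real harvester would fetch `first_url`, parse the reply, and keep requesting `next_url` until no resumption token is returned.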

The aggregated content is also ingested into the Joint Resource Registry, which exposes an end-user GUI (Resource Catalogue) and a machine-oriented API (Resource Registry) for resource discovery. Data and services registered in the JRR become discoverable by and accessible to users and other services of the federation. For example, a user can transparently run a CLARIN service for full-text mining on a dataset of medieval full texts available in the CENDARI DHI. Computational results can be stored and shared with a selection of colleagues or publicly, by publishing them into the JRR. The JRR also provides functionality for infrastructure management.

The JRR is based on the gCube enabling technology [3], an open-source software toolkit used for building and operating hybrid data infrastructures [4]. gCube enables the dynamic deployment of virtual research environments and favours the realisation of reuse-oriented policies. It is developed and maintained by CNR-ISTI.

The PARTHENOS technical framework is currently at the beta stage and is operated on the D4Science infrastructure [L10] at the Institute of Information Science and Technologies of the Italian National Research Council (ISTI-CNR). Representatives of the consortium are actively preparing mappings for metadata, selecting data and services to share, and setting up VREs. As of August 2017, two VREs have been created: one includes services for natural language processing and semantic enrichment of textual data; the other is meant for the integration of reference resources. In the coming months, the framework will be deployed in a production environment and assessed by a selection of humanities researchers in the consortium. We plan to open the framework to all researchers of DHIs in the consortium by the end of the project (April 2019).

References:  
[1] L. Candela et al.: “Virtual research environments: an overview and a research agenda”, Data Science Journal, 12, GRDI75-GRDI81, 2013. http://doi.org/10.2481/dsj.GRDI-013
[2] P. Manghi et al.: “The D-NET software toolkit: A framework for the realization, maintenance, and operation of aggregative infrastructures”, Program, Vol. 48, Issue 4, pp. 322-354, 2014. https://doi.org/10.1108/PROG-08-2013-0045
[3] L. Candela, P. Pagano: “Cross-disciplinary data sharing and reuse via gCube”, in: ERCIM News, Issue 100, January 2015. https://kwz.me/hO7
[4] L. Candela et al.: “Managing Big Data through Hybrid Data Infrastructures”, in ERCIM News, Issue 89, April 2012. https://kwz.me/hO8

Links:
[L1] https://kwz.me/hO9
[L2] http://www.ariadne-infrastructure.eu/
[L3] https://www.ehri-project.eu/
[L4] http://www.cendari.eu
[L5] https://www.clarin.eu
[L6] http://www.dariah.eu/
[L7] http://www.cidoc-crm.org/
[L8] https://kwz.me/hOf
[L9] https://kwz.me/hOj
[L10] https://parthenos.d4science.org/

Please contact:
Alessia Bardi, Luca Frosini
ISTI-CNR, Italy