In recent years scientists have been rethinking research workflows in favour of innovative paradigms to support multidisciplinary, computationally-heavy and data-intensive collaborative activities. In this context, e-Infrastructures can play a crucial role in supporting not only data capture and curation but also data analysis and visualization. Their implementation demands seamless and on-demand access to computational, content, and application services such as those typified by the Grid and Cloud Computing paradigms. gCube is a software framework designed to build e-Infrastructures supporting Virtual Research Environments, ie on-demand research environments conceived to realise the new science paradigms.
The eScience community is currently examining the feasibility of setting up innovative Virtual Research Environments (VREs) to meet the requirements of collaborative activities. VREs are designed to support both small and large-scale computationally-intensive, data-intensive and collaboration-intensive tasks, and to serve research communities potentially distributed over multiple domains and institutions.
A promising approach to building and operating VREs is based on e-Infrastructures, ie frameworks that enable secure, cost-effective and on-demand resource sharing across organizational boundaries. An e-Infrastructure can be seen as a “mediator” that accommodates resource sharing between resource providers and consumers, either human or inanimate. Resources are understood as generic entities, either physical (eg storage and computing resources) or digital (eg software, processes, data), that can be shared and can interact with other resources to synergistically provide various types of service. A service-based paradigm is needed in order to share and reuse these resources. The e-Infrastructure layer allows resource providers to “sell” their resources, and resource consumers to “buy” them and use them to build their applications. It also provides organizations with logistical and technical support for application building, maintenance, and monitoring.
The e-Infrastructure vision has much in common with Grid Computing and Cloud Computing. All three aim to reduce computing costs via economies of scale, and all attempt to achieve this by managing a pool of abstracted and virtualized resources and offering on-demand computing power, storage facilities and services to “external” customers over the internet. The differences mainly reside in the services they offer, their business models, and the technologies that characterize them.
Screenshots of gCube-based Virtual Research Environments.
gCube is a software system specifically conceived to develop and operate large scale e-Infrastructures, enabling the declarative definition and automatic deployment and operation of VREs.
gCube facilities for e-Infrastructure development include a rich array of mediator services for interfacing with existing infrastructure-enabling technologies, including grid (eg gLite/EGEE), cloud (eg Hadoop) and data source (eg OAI-PMH) oriented approaches. Via these mediator services, the storage facilities, processing facilities and data resources of the external infrastructures are conceptually unified to become gCube resources. gCube also offers facilities for deploying gCube Nodes, ie servers offering storage and computing facilities (similar to the Infrastructure as a Service Cloud), together with the dynamic deployment of gCube services (similar to the Platform as a Service Cloud). These resources are complemented by an approach similar to the Software as a Service Cloud, ie software frameworks for data management, data integration, workflow definition and execution, information retrieval, and user interface building.
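The unification performed by mediator services can be pictured as an adapter layer. The following Python sketch is purely illustrative and is not gCube's actual API: the names `Resource`, `GridMediator`, `DataSourceMediator` and `unify`, as well as the sample resource names, are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Resource:
    """Uniform description of a shared resource (hypothetical model)."""
    name: str
    kind: str    # eg "computing", "storage", "data"
    origin: str  # external infrastructure the resource comes from


class Mediator(Protocol):
    """Anything that can expose external resources in the uniform model."""
    def discover(self) -> list[Resource]: ...


class GridMediator:
    """Hypothetical mediator wrapping computing elements of a grid."""
    def __init__(self, elements: list[str]) -> None:
        self._elements = elements

    def discover(self) -> list[Resource]:
        return [Resource(e, "computing", "grid") for e in self._elements]


class DataSourceMediator:
    """Hypothetical mediator wrapping collections of an external data source."""
    def __init__(self, collections: list[str]) -> None:
        self._collections = collections

    def discover(self) -> list[Resource]:
        return [Resource(c, "data", "data-source") for c in self._collections]


def unify(mediators: list[Mediator]) -> list[Resource]:
    """Merge the resources of all mediators into a single pool."""
    pool: list[Resource] = []
    for mediator in mediators:
        pool.extend(mediator.discover())
    return pool


pool = unify([
    GridMediator(["ce-01", "ce-02"]),
    DataSourceMediator(["species-occurrences"]),
])
```

Once behind a common interface, resources from different infrastructures can be listed, selected and monitored in the same way, which is the point of the conceptual unification described above.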
By relying on this wide range of resources and services, gCube-based e-Infrastructures enable scientists to declaratively and dynamically build the VREs they need while abstracting from the implementation details. gCube technology implements a user-friendly Platform as a Service Cloud “function” where the content, application services, and computing resources needed by a scientist are automatically aggregated and deployed, and made available through a web-based interface. The aggregated resources are also monitored to guarantee the VRE service.
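To make the idea of declarative VRE definition concrete, here is a minimal Python sketch under stated assumptions. It does not reproduce gCube's actual definition language or deployment logic: `VRESpec`, `assemble_vre`, the "first match" selection rule and the sample resources are hypothetical, chosen only to show a scientist stating what is needed while the system decides which resources satisfy it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A shared resource available in the infrastructure (hypothetical)."""
    name: str
    kind: str  # eg "data", "computing", "application"


@dataclass(frozen=True)
class VRESpec:
    """Declarative description of a VRE: what is needed, not how to get it."""
    name: str
    required_kinds: tuple[str, ...]


def assemble_vre(spec: VRESpec, pool: list[Resource]) -> dict[str, Resource]:
    """Pick one resource per required kind (assumed rule: first match wins)."""
    selected: dict[str, Resource] = {}
    for kind in spec.required_kinds:
        matches = [r for r in pool if r.kind == kind]
        if not matches:
            raise LookupError(f"no resource of kind {kind!r} for VRE {spec.name!r}")
        selected[kind] = matches[0]
    return selected


pool = [
    Resource("node-a", "computing"),
    Resource("catch-statistics", "data"),
    Resource("time-series-viewer", "application"),
]
spec = VRESpec("fishery-statistics", ("data", "computing", "application"))
vre = assemble_vre(spec, pool)
```

A real infrastructure would also deploy the selected services and keep monitoring them; the sketch covers only the aggregation step.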
gCube technology now serves a number of challenging scientific domains: marine biologists generating model-based large-scale predictions of natural occurrences of marine species, high-energy physicists mining bibliometric data and producing hybrid metrics on the entire corpus of their literature, and fishery statisticians managing and integrating catch statistics.
gCube is the result of the collaborative efforts of researchers and developers from academic and industrial research centres including the Institute of Information Science and Technologies ISTI-CNR (IT), University of Athens (GR), University of Basel (CH), Engineering Ingegneria Informatica SpA (IT), University of Strathclyde (UK), CERN European Organization for Nuclear Research (CH), and 4D SOFT Software Development Ltd (HU). Its development has been partially supported by the DILIGENT project (FP6-2003-IST-2, Contract No. 004260), the D4Science project (FP7-INFRA-2007-1.2.2, Contract No. 212488), and the D4Science-II project (FP7-INFRA-2008-1.2.2, Contract No. 239019).