by Daniele Albanese, Giuseppe Crincoli, Marco De Vincenzi, Giacomo Iadarola, Fabio Martinelli, Ilaria Matteucci and Paolo Mori (CNR-IIT)
Allowing data producers to retain some control of the data they share is of paramount importance to encourage data sharing for collaborative analytics.
E-CORRIDOR [L1] is an EU-funded project aimed at defining a framework for multi-modal transport systems that provides secure advanced services to passengers and transport operators. The framework implements collaborative, privacy-aware, edge-enabled information sharing, analysis and protection as a service. The actors of this framework, either travellers or transport companies, can be seen as information prosumers, i.e. producers and consumers of information. Information can be raw data as well as complex attack indicators that prosumers may wish to share in order to enhance security and safety, or just to get a better transportation service. Figure 1 shows how information analysis and sharing relate to multiple aspects of multimodal transport. Users can share their travel preferences; cars can be seen as sensor platforms producing several kinds of data; cars can automatically recognise drivers and, similarly, traffic jams. We plan to empower users to share the data they wish, in the way they wish, to enable the execution of collaborative data analysis.
Figure 1: E-CORRIDOR operation concept.
The E-CORRIDOR framework is based on the concept of a data sharing agreement (DSA), which is an agreement among a set of parties that regulates the sharing of information among them. The framework provides an infrastructure that enforces the DSA when information is shared and analytics are executed on it. Preserving information privacy is a fundamental feature of the E-CORRIDOR framework, because it encourages producers to share their information with the other actors of the system, knowing that they retain some control over its subsequent use. Information can be analysed either globally (in the cloud) or locally (on edge devices). Local analysis increases privacy, although global analysis, which has more information available, could be more accurate. Our framework has the following key components:
- Information sharing: share information (including security-related information) in a controlled manner, ensuring confidentiality and integrity as well as regulatory compliance, both at rest and in transit
- Information analytics: advanced functions for data analysis and correlation that identify threats hidden in the massive usage of services and the correspondingly large number of logs
- Mixture of technologies: a mix of technologies, including homomorphic encryption, that enables confidential and collaborative analysis of data
- Advanced seamless access: mechanisms that take advantage of the analytics and sharing infrastructure to provide continuous authentication and authorisation, as well as privacy-aware services such as privacy-aware data usage control.
Data Sharing Agreements
A DSA is a digital contract that defines a set of constraints regulating the sharing of data among parties. The DSA is the basis of E-CORRIDOR data protection support, since it specifies which actions (e.g., analytics operations) can be performed on each piece of data, which subjects can execute these actions, and which other conditions must be satisfied for the framework to authorise their execution. Among other information, a DSA includes the Policy, which consists of a number of authorisation and prohibition rules, to be enforced on data sharing, that express constraints on the attributes describing the subject and the data. Policy rules also include obligations, which are actions that must be executed (e.g., data anonymisation) before the data is made available for use. With regard to enforcement time, rules can be of two kinds: pre or ongoing. Pre rules are evaluated when an E-CORRIDOR user requests the execution of an analytics operation, and contribute to deciding whether the execution can start. Ongoing rules are evaluated while the analytics are running and determine whether the execution can continue or must be interrupted because of a policy violation. DSAs also allow data producers to define rules that constrain the other pieces of data involved in the collaborative analytics together with the piece of data the DSA refers to. A graphical representation of a DSA is shown in Figure 2.
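To make the rule structure more concrete, the following is a minimal Python sketch of how authorisation and prohibition rules, obligations and the pre/ongoing distinction could fit together. It is not the E-CORRIDOR policy language or enforcement engine: the names `Rule`, `Decision` and `evaluate`, the attribute keys, and the conflict-resolution strategy (a prohibition overrides any authorisation, and a request with no matching authorisation is denied) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Attributes = Dict[str, object]


@dataclass
class Rule:
    effect: str  # "authorise" or "prohibit"
    phase: str   # "pre" or "ongoing"
    # Condition over (subject attributes, data attributes).
    condition: Callable[[Attributes, Attributes], bool]
    # Actions to run before the data is released (e.g. anonymisation).
    obligations: List[str] = field(default_factory=list)


@dataclass
class Decision:
    permitted: bool
    obligations: List[str]


def evaluate(rules: List[Rule], phase: str,
             subject: Attributes, data: Attributes) -> Decision:
    """Evaluate the rules of one phase (assumed deny-overrides, default deny)."""
    applicable = [r for r in rules
                  if r.phase == phase and r.condition(subject, data)]
    if any(r.effect == "prohibit" for r in applicable):
        return Decision(False, [])
    granted = [r for r in applicable if r.effect == "authorise"]
    obligations = [o for r in granted for o in r.obligations]
    return Decision(bool(granted), obligations)


# Hypothetical policy: operators may analyse data after anonymisation,
# biometric data may never be analysed, and a running analytics may
# continue only while the requester's session is active.
policy = [
    Rule("authorise", "pre",
         lambda s, d: s["role"] == "transport_operator", ["anonymise"]),
    Rule("prohibit", "pre",
         lambda s, d: d["category"] == "biometric"),
    Rule("authorise", "ongoing",
         lambda s, d: bool(s["session_active"])),
]

operator = {"role": "transport_operator", "session_active": True}
traffic = {"category": "traffic"}

start = evaluate(policy, "pre", operator, traffic)
blocked = evaluate(policy, "pre", operator, {"category": "biometric"})
stopped = evaluate(policy, "ongoing",
                   {**operator, "session_active": False}, traffic)
```

In this sketch `start` permits the execution and carries the `anonymise` obligation, `blocked` is denied by the prohibition rule, and `stopped` shows an ongoing check failing once the session attribute changes, which would correspond to interrupting the analytics.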
Figure 2: Structure of Data Sharing Agreements.
The E-CORRIDOR architecture, shown in Figure 3, is designed to support data sharing and the execution of collaborative analytics while enforcing DSAs. E-CORRIDOR users act as data producers when they upload their data (paired with the related DSA) to the E-CORRIDOR framework, thus creating Data Bundles. E-CORRIDOR users act as data consumers when they request the execution of collaborative analytics by selecting a set of Data Bundles to be used.
Figure 3: E-CORRIDOR Architecture.
Information Sharing Infrastructure (ISI)
The ISI is in charge of managing data, ensuring its secure storage as well as its privacy-preserving sharing. To protect confidentiality at rest, when data producers upload new data, the ISI embeds it in cryptographic containers, called Data Bundles, before it is stored on the storage system. Several storage systems can be supported, e.g., local storage or cloud-based storage services. To protect data privacy when data is shared for collaborative analytics, DSAs are embedded in Data Bundles and, every time an E-CORRIDOR user requests the execution of collaborative analytics on a Data Bundle, the corresponding DSA is enforced to check whether that user actually holds the right to use the data for those analytics. Moreover, the ISI subsystem allows the integration of customised privacy-preserving operations (such as data-specific anonymisation operations) following a plugin approach. These operations are executed on the data extracted from Data Bundles when required by the DSAs, before the data is made available for executing the analytics.
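The plugin approach and the release step can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the ISI implementation: `SimpleDSA`, `DataBundle`, `register_plugin`, `release_for_analytics` and the `drop_plate` operation are hypothetical names, the DSA is reduced to a list of allowed roles plus a list of obligations, and the encryption of the bundle at rest is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]
Operation = Callable[[List[Record]], List[Record]]

# Registry of privacy-preserving operations, filled in by plugins.
PLUGINS: Dict[str, Operation] = {}


def register_plugin(name: str) -> Callable[[Operation], Operation]:
    """Decorator that registers an operation under the given name."""
    def decorator(func: Operation) -> Operation:
        PLUGINS[name] = func
        return func
    return decorator


@register_plugin("drop_plate")
def drop_plate(records: List[Record]) -> List[Record]:
    # Hypothetical data-specific anonymisation: remove licence plates.
    return [{k: v for k, v in r.items() if k != "plate"} for r in records]


@dataclass
class SimpleDSA:
    allowed_roles: List[str]   # who may run analytics on the data
    obligations: List[str]     # plugin names to apply before release


@dataclass
class DataBundle:
    payload: List[Record]      # a real container would keep this encrypted
    dsa: SimpleDSA
    metadata: Dict[str, str]


def release_for_analytics(bundle: DataBundle, role: str) -> List[Record]:
    """Enforce the embedded DSA, then apply the operations it requires."""
    if role not in bundle.dsa.allowed_roles:
        raise PermissionError(f"role {role!r} is not authorised by the DSA")
    data = bundle.payload
    for name in bundle.dsa.obligations:
        data = PLUGINS[name](data)
    return data


bundle = DataBundle(
    payload=[{"plate": "AB123CD", "speed": 42}],
    dsa=SimpleDSA(allowed_roles=["transport_operator"],
                  obligations=["drop_plate"]),
    metadata={"kind": "speed"},
)
released = release_for_analytics(bundle, "transport_operator")
```

Here `released` contains the speed reading without the plate, while the stored payload is left unchanged. Calling `release_for_analytics(bundle, "advertiser")` raises `PermissionError`, since that role is not listed in the DSA.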
Information Analytics Infrastructure (IAI)
The IAI manages the execution of collaborative analytics on the data shared through the ISI subsystem. E-CORRIDOR users act as data consumers, interacting with the IAI to request the execution of collaborative analytics on a given set of Data Bundles, which is typically defined using a query on the metadata paired with the Data Bundles themselves. The IAI invokes the ISI subsystem to search for and retrieve these Data Bundles, which are then used to execute the requested analytics. When the analytics execution has completed, the IAI subsystem interacts with the ISI subsystem to create a new Data Bundle, making the results of the analytics execution available to E-CORRIDOR users.
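The interaction between the two subsystems can be sketched in a few lines of Python. This is only an illustration of the search, retrieve, execute and store sequence described above: `InMemoryISI`, `Bundle` and `run_analytics` are hypothetical stand-ins rather than E-CORRIDOR interfaces, the metadata query is assumed to be a simple exact match on key/value pairs, and DSA enforcement is omitted to keep the flow visible.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class Bundle:
    metadata: Dict[str, str]
    values: List[float]


class InMemoryISI:
    """Stand-in for the ISI: stores bundles and searches them by metadata."""

    def __init__(self) -> None:
        self._bundles: List[Bundle] = []

    def store(self, bundle: Bundle) -> Bundle:
        self._bundles.append(bundle)
        return bundle

    def search(self, query: Dict[str, str]) -> List[Bundle]:
        # A bundle matches when every queried key has the requested value.
        return [b for b in self._bundles
                if all(b.metadata.get(k) == v for k, v in query.items())]


def run_analytics(isi: InMemoryISI,
                  query: Dict[str, str],
                  analytics: Callable[[List[float]], float],
                  result_metadata: Dict[str, str]) -> Bundle:
    """Select bundles by metadata, run the analytics, store the result."""
    selected = isi.search(query)
    if not selected:
        raise LookupError("no Data Bundle matches the query")
    values = [v for b in selected for v in b.values]
    result = analytics(values)
    # The result is wrapped in a new bundle so that users can retrieve it.
    return isi.store(Bundle(metadata=result_metadata, values=[result]))


isi = InMemoryISI()
isi.store(Bundle({"kind": "speed", "source": "car_a"}, [40.0, 60.0]))
isi.store(Bundle({"kind": "speed", "source": "car_b"}, [50.0]))
isi.store(Bundle({"kind": "co2", "source": "car_a"}, [1.0]))

result_bundle = run_analytics(isi, {"kind": "speed"}, mean,
                              {"kind": "speed_mean"})
```

After the call, the result is itself stored as a bundle and can be found with `isi.search({"kind": "speed_mean"})`, mirroring the way analytics results are made available through a new Data Bundle.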
Fabio Martinelli, CNR-IIT, Italy