by Roberto Trasarti, Katia Genovali and Beatrice Rapisarda (CNR-ISTI)
In our complex society, the ethical use and storage of data are essential if the scientific community and institutions are to build trust among citizens. SoBigData is a pan-European, cross-disciplinary Research Infrastructure for social mining and data analytics that bases its research activities on ethics and fairness. SoBigData does not only apply science to the most challenging societal issues; it also provides data and facilities to researchers, and services to firms and public administrations, so that they can develop innovative tools and respond to societal needs. Above all, it works to create an ecosystem for data research that respects the founding principles of Europe for the benefit of the whole community.
Every moment of our lives, we release data through our devices. Computers, tablets and mobile phones, but also smartwatches, credit cards and internet access, leave behind a wealth of data that records our daily activities. This data can reveal a lot about an individual's habits, movements and personal preferences, which is why it needs to be managed in the most ethical way possible.
Ethics is one of the key principles at the heart of SoBigData [L1], a Research Infrastructure (RI) dedicated to big data and social mining that addresses the most pressing societal issues: sustainable cities, demography and economics, online misinformation and migration. SoBigData data scientists can also study the social impact of artificial intelligence, sports data science and medicine.
In 2017 SoBigData started bringing together top research centres around Europe to create a common technological ground and give researchers across Europe access to them. From there, the consortium behind the RI grew to include more and more institutions, becoming a multidisciplinary network of computer scientists and social scientists studying different aspects of society. Today, more than a hundred SoBigData researchers from 29 data sites across Europe are working together to answer challenging research questions about society. The aim is to create an ecosystem in big data and social mining research where values and standards of privacy, fairness, transparency and pluralism coexist (Figure 1).
Figure 1: The SoBigData RI structure, centred on the Catalogue, allows the user to find the proper resources in the infrastructure. It includes the Exploratories, virtual environments where new research is developed (Demography, Economy and Finance 2.0, Migration Studies, Network Medicine, Social Impact of AI and explainable ML, Societal Debates and Misinformation, Sports Data Science and Sustainable Cities for Citizens); the SoBigData Lab, which hosts resources for running experiments on the cloud; and the High Performance Computing (HPC) Portal, with useful information on accessing European facilities. SoBigData RI can be accessed virtually, through its website and gateway [L1], or in person, by visiting its nodes through the Transnational Access Programme.
The SoBigData RI includes numerous laboratories around Europe that together form the network of European nodes of the e-infrastructure, e.g. the Institute of Information Science and Technologies “Alessandro Faedo” of the National Research Council (CNR-ISTI) in Italy, the project coordinator. The infrastructure is based on the most advanced techniques of big data analysis, combining artificial intelligence and social issues, a highly strategic sector in today's European and global economy.
The SoBigData RI platform, which serves more than 10,000 users, enables researchers to perform large-scale social mining experiments in computational social sciences, digital humanities, urban planning, welfare, migration, and sport and health, within the legal and ethical framework of responsible data science. The RI offers innovative, free services to its users, such as the SoBigData Lab, where they can design and develop new algorithms in an interactive environment with ready-to-use libraries, deploy a method, or run experiments on its cloud engine.
All the available methodologies, datasets, scientific publications and training materials are collected in a comprehensive, navigable catalogue that aims to provide an easy way to search, find and reproduce innovative social mining experiments.
In addition to online services, SoBigData offers a mobility programme under the European Transnational Access Programme, allowing users to visit and collaborate with top researchers at its nodes across Europe.
By its very nature, the infrastructure encourages and enables users to carry out novel interdisciplinary studies based on ethical and privacy principles. All services offered are subject to a rigorous ethical assessment in order to inform and promote European principles in social analysis. The ethical use of data and research results is the fundamental principle of SoBigData, whose developers and users respect the FAIR (Findable, Accessible, Interoperable, Reusable) and FACT (Fair, Accurate, Confidential and Transparent) principles and the ELSEC (Ethical, Legal, Social, Economic and Cultural) perspective to ensure the qualitative growth of the RI.
But scientists are not the only beneficiaries of SoBigData; it is also a data-driven innovation accelerator that facilitates collaboration between researchers, industry and start-ups to develop pilot and proof-of-concept projects. In fact, companies are among the stakeholders of the SoBigData Infrastructure, as innovation in big data and artificial intelligence is one of the main challenges for the development of a true Industry 4.0 and an even more digitised and intelligent society.
SoBigData RI has been selected by the European Strategy Forum on Research Infrastructures (ESFRI) as part of the Roadmap 2021. This puts the RI in a long-term perspective, with the aim of becoming the European platform for social mining and a service provider for the European Open Science Cloud (EOSC) initiative. SoBigData RI can be an important player in the European landscape, training the next generation of responsible data scientists and providing services to connect researchers, industry and policymakers to improve the quality of life.
 R. Trasarti, V. Grossi, M. Natilli, et al., “SoBigData RI: European integrated infrastructure for social mining and big data analytics”, in SEBD, 2022, pp. 117–124.
 V. Grossi, F. Giannotti, D. Pedreschi, et al., “Data science: a game changer for science and innovation”, Int. J. Data Sci. Anal., vol. 11(4), pp. 263–278, 2021.
Roberto Trasarti, CNR-ISTI, Italy