by George Bruseker (ICS-FORTH), László Kovács (MTA-SZTAKI), Franco Niccolucci (University of Florence)
This special theme “Digital Humanities” of ERCIM News is dedicated to new projects and trends in the interdisciplinary domain between computer science and the humanities.
Just as experimental research is based on reasoning over experiments, research in the humanities is based on reasoning over sources, which may be textual, material or intangible. Digital humanities (DH) is the interdisciplinary research field that studies how computer science can support such investigations by creating tools and tailoring computer technology to the specific needs of humanities research, while also addressing the methodological issues that arise from the intensive use of such digital means. This issue showcases cutting-edge work being undertaken in DH in the following areas: digital source indexing and analysis, information management strategy, research infrastructure development, 3D analysis techniques, and information visualisation and communication.
Digital Source Indexing and Analysis
Following an earlier period of research and investment in digitisation, when huge amounts of analogue content were turned into digital products, the research frontier in DH has now shifted to automatic or semi-automatic ways of dealing with digital sources. Such sources include both the aforementioned digitised materials and, increasingly, born-digital texts and other digital media such as audio, images and video. A key research challenge is to create methodologies and tools for finding the needle in the proverbial haystack of millions of poorly indexed files. At present, hot topics include the Optical Character Recognition (OCR) of digitised manuscripts and the parallel techniques of speech recognition, as well as the use of Named Entity Recognition (NER) on the resulting digital text files. This is the subject of the article “In Codice Ratio: Scalable Transcription of Vatican Registers” by Firmani et al., which proposes a supervised NER using mixed algorithmic and crowdsourcing approaches, as well as of Felicetti’s paper, “Teaching Archaeology to Machines: Extracting Semantic Knowledge from Free Text Excavation Reports”, and Brouwer’s “MTAS – Extending Solr into a Scalable Search Solution and Analysis Tool on Multi-Tier Annotated Text”. Two articles address multimedia sources, presenting innovative solutions to retrieval and discovery: Dologlou et al. deal with audio and visual sources in “Phonetic Search in Audio and Video Recordings”, while Köhler et al. address speech recognition and analysis in “KA3: Speech Analytics for Oral History and the Language Sciences”. Addressing the general question of querying and discovery in large humanities datasets, Devezas et al. introduce a perspective and strategy for integrating information using graph technology in “Graph-Based Entity-Oriented Search: Imitating the Human Process of Seeking and Cross Referencing Information”.
Another increasingly important research topic in this area relates to fact checking and determining the veracity of claims made in sources, as well as the impact of arguments in social contexts. The contributions of Manolescu in “ContentCheck: Content Management Techniques and Tools for Fact-checking” and Heder et al. in “Argumentation Analysis of Engineering Research” offer perspectives on how to critically assess structured resources through pattern recognition.
Information Management Strategy
The long-term control of this ‘needle in the haystack’ problem and the efficient use of research data sources raise general methodological questions about how to create robust information management strategies in the humanities. This area of research addresses the rapid expansion of digital sources in multiple formats, covering both broader and more precise topics, and the question of how we can create long-term sustainable access for the reuse of resources in a way that promotes accessibility and the quality necessary to support academic research. Part of the question here is how to create and successfully use common or transparent expressions and translations of data across domains, as well as how to create, share and properly use common vocabularies for describing data. Advances in the areas of vocabularies and semantics are reported by Daskalaki et al. in “A Back Bone Thesaurus for Digital Humanities” and by Bruseker et al. in “Meeting the Challenges to Reap the Benefits of Semantic Data in Digital Humanities”. Beyond the data question, however, lie the practical and social dimensions of long-term data management: understanding and formalising procedures and protocols in a new digital research environment, and putting them efficiently in place in information systems in the present research economy. Clivaz et al. in “HumaReC: Continuous Data Publishing in the Humanities” and Basset in “A Data Management Plan for Digital Humanities: the PARTHENOS Model” provide views and answers on how to meet the challenges, obligations and opportunities that arise for digital humanists creating digital archives that are re-usable by others in an open access framework. Another important area of research is supporting trust in digital resources. This area is taken up by van Ossenbruggen in “Trusting Computation in Digital Humanities Research”.
Finally, a perennial area demanding new development and imagination lies in creating tools that allow researchers to generate data meeting the above criteria in a non-onerous manner.
Research Infrastructure Development
In order to support forward-looking, comprehensive and critical approaches to these issues, the European Commission provides a large part of the funding for digital humanities research, especially through the Infrastructures Programme. In particular, this programme supports the creation of research infrastructures (RIs), which are networks of facilities, resources and services offered to a research community to support and catalyse its work. Following the roadmap created by ESFRI (European Strategy Forum on Research Infrastructures), some of these RIs are upgraded and designated as an ERIC (European Research Infrastructure Consortium), according to the recommendation of a panel of experts. An ERIC is a transnational institution tasked with managing an RI and fostering innovation in the related research field. This issue brings news on the progress of a number of ERICs within the humanities domain. Edmond et al. report on the activity of DARIAH (Digital Research Infrastructure for the Arts and the Humanities) in “The DARIAH ERIC: Redefining Research Infrastructure for the Arts and Humanities in the Digital Age”, while the digital infrastructure of E-RIHS (European Research Infrastructure for Heritage Sciences) is described in Pezzati’s “DIGILAB, a New Infrastructure for Heritage Science”. Meanwhile, Bardi et al. present research into the design and implementation of a generalised information architecture for digital humanities in “Building a Federation of Digital Humanities Infrastructures”. Finally, in “Knowledge Complexity and the Digital Humanities: Introducing the KPLEX Project”, Edmond et al. announce new research in the context of such RIs that takes an explicitly humanities-grounded, critical look at the concept of ‘data’ itself and at how this concept affects what may or may not be digitised, and the manner in which it is perceived or used.
3D Analysis Techniques
Turning to the application of digital methods to the study of material culture (e.g., man-made objects or architecture), we find continuous innovation in the use of digital techniques to better understand extant objects or to produce academically sourced and grounded representations of now lost heritage. In this domain, the use of 3D imaging techniques, and the invention of means to apply them analytically to heritage research, is a key area of innovation. In this context, the paper by Hanif et al., “Restoration of Ancient Documents Using Sparse Image Representation”, presents a means to link research on documents with research on the matter on which they are recorded, allowing the virtual restoration of ancient documents. Mature research on the application of 3D technology to monuments is represented by Wall et al. in “The Virtual St Paul’s Cathedral Project”, which reports on the virtual reconstruction of St Paul’s based on historical sources. Meanwhile, Barreau et al. in “Immersive Point Cloud Manipulation for Cultural Heritage Documentation” present an innovative application that uses 3D models in a VR platform to allow archaeologists to work directly with such products in their research and reporting activities. In “Physical Digital Access inside Archaeological Material”, Nicolas et al. demonstrate the uses of 3D imaging in conducting non-destructive research on heritage materials. Finally, Alliez et al. in “Culture 3D Cloud” present a platform for cloud computing of 3D objects, both for enabling their online analysis and for communicating them to other researchers and the public.
Information Visualisation and Communication
Indeed, digital humanities research does not take place in a bubble but, thanks in no small part to its form, is inherently suited to communication to a broader public, both of researchers in other disciplines and of interested parties in general. Consequently, considerable research effort is being put into the question of how to present digital heritage and humanities content and make it understandable and accessible to this wider audience. Papers in this issue addressing the communication of history, heritage or museum exhibits with digital tools include Morillo et al., “Re-Interpreting European History through Technology: The CrossCult Project”, and Micsik et al., “Cultural Opposition in former European Socialist Countries: Building the COURAGE Registry”, which look at means of gathering and presenting heretofore under-represented historical information for researchers and the public. A series of contributions also look at innovative ways to explore digital humanities datasets. This research takes many directions, from exploring the uses of augmented reality, as seen in Tamisier et al., “Locale, an Environment-Aware Storytelling Framework Relying on Augmented Reality”, to the application of virtual reality to connect time and space, as presented by Koebel et al. in “The ‘Biennale 4D’ Project”, to new techniques for exploring graph-based data on the web, as reported by Abrate et al. in “The Clavius Correspondence: From Digitization to Visual Exploration of Knowledge”. In a related vein, Chessa et al. explore how to connect personal mobile devices to relevant services for data exploration, inter alia, in “Service-oriented Mobile Social Networking”. Each of these contributions explores innovative approaches to better communicate history and heritage to visitors to museums, monuments or heritage sites around the globe.
The contributions to this special theme issue on digital humanities give an insight into some of the main currents of research now being undertaken in DH. Such research is carried out as much within the context of smaller research projects as in European-funded transnational structures such as the ERICs. The diversity of research demonstrated is strong evidence of the vitality of this interdisciplinary domain, where state-of-the-art digital technology goes hand in hand with the study of human culture of the present and of the past.
University of Florence, Italy