by Dimitra Anastasiou and Eric Ras (Luxembourg Institute of Science and Technology)
2D gestures have been extensively examined on surface and interactive tabletop computing, both in the context of training on predefined datasets and “in the wild”. The same cannot be said for 3D gestures, however. The current literature does not address what happens above tabletop interfaces, nor does it address the semantics of those gestures.
Ullmer and Ishii describe tangible user interfaces as giving “physical form to digital information, employing physical artifacts both as ‘representations’ and ‘controls’ for computational media”. The “Gestures in Tangible User Interfaces” (GETUI) project [L1] coupled 2D and 3D gesture analysis with tangible user interfaces (TUIs), with the aim of achieving technology-based assessment of collaborative problem-solving.
We ran exploratory user studies with 66 pupils at two secondary schools in Luxembourg and one in Belgium. We used the multiuser interactive display MultiTaction MT550 and a Kinect v.2.0 depth-sensing camera to record 3D data relating to the participants. The participants’ task, presented as a microworld scenario on the MultiTaction device, was to build a power grid by placing and rotating tangible objects or widgets, which represented industrial facilities (coal-fired power plants, wind parks, and solar parks) that produce electricity (Figure 1). The microworld scenario was designed and developed with the COllaborative Problem Solving Environment (COPSE), a novel software framework for instantiating microworlds as collaborative problem-solving activities on tangible tabletop interfaces.
Figure 1: Microworld scenario on the TUI.
We used the Kinect camera to explore the behaviour of the participants during the collaborative problem solving. The main problem we experienced with Kinect was user identification. As is common in multi-user environments, users moved frequently in order to explore different parameters on the TUI, with the result that their initial IDs were lost or exchanged, leading to misinterpretation of the logging data. The lighting and the position of the Kinect also had to be selected carefully. An additional technical limitation is the recognition of finger-based gestures, including emblems (substitutes for words) and adaptors (gestures performed without conscious awareness to manage one's feelings).
A further, research-related drawback is the definition of a gesture, and particularly of a cooperative gesture. When exactly does a gesture start and when does it end? A gesture usually passes through up to five phases: preparation, prestroke hold, the stroke itself, poststroke hold, and retraction. In our TUI setting, the most prominent gesture type is pointing. What if, during the poststroke hold of user A, user B is in the preparation phase of her own gesture? We define a cooperative gesture as a sequence of two or more gestures, the first of which is always pointing, performed simultaneously or consecutively by different users (not by the same user). However, the impact of a cooperative gesture has to be annotated manually, since it can be positive, negative, or none.
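The definition of a cooperative gesture given above can be illustrated with a minimal Python sketch. It is not the GETUI implementation: the `Gesture` record, the `cooperative_pairs` function, and in particular the `max_gap` threshold (how many seconds may pass before two gestures no longer count as consecutive) are hypothetical names and values chosen for illustration only. The sketch assumes that gestures have already been segmented and labelled with a user, a type, and start and end times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gesture:
    """One annotated gesture (hypothetical record layout)."""
    user: str     # user label, e.g. "A"
    kind: str     # "pointing", "emblem", "adaptor", ...
    start: float  # seconds
    end: float    # seconds


def cooperative_pairs(gestures, max_gap=2.0):
    """Return (pointing, follow_up) pairs performed by different users.

    A follow-up counts if it starts while the pointing gesture is still
    running (simultaneous) or at most `max_gap` seconds after it ends
    (consecutive). `max_gap` is an assumed threshold, not a project value.
    """
    # Sort by start time; on ties, put pointing gestures first so that
    # they can act as the opening gesture of a sequence.
    ordered = sorted(gestures, key=lambda g: (g.start, g.kind != "pointing"))
    pairs = []
    for i, first in enumerate(ordered):
        if first.kind != "pointing":
            continue
        for second in ordered[i + 1:]:
            if second.start - first.end > max_gap:
                break  # all later gestures start even later
            if second.user != first.user:
                pairs.append((first, second))
    return pairs


log = [
    Gesture("A", "pointing", 0.0, 1.5),
    Gesture("B", "emblem", 1.0, 2.0),     # overlaps A's pointing
    Gesture("A", "adaptor", 1.2, 1.8),    # same user, so not cooperative
    Gesture("B", "pointing", 10.0, 11.0),
    Gesture("C", "pointing", 20.0, 21.0), # too late to follow B's pointing
]
for first, second in cooperative_pairs(log):
    print(first.user, first.kind, "->", second.user, second.kind)
```

Such a filter would only propose candidate sequences; whether the impact of each one is positive, negative, or none would still have to be annotated manually.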
The 2D gestures, i.e., gestures performed on the tabletop, are logged by the COPSE software. We decided to develop an application that links COPSE with Kinect, so that all performed gestures, both 2D and 3D, are logged. Our application consists of two components: i) the Client Reader (COPSE), which reads and transfers the object IDs and the objects' coordinates on the TUI, and ii) the Body Reader, which receives these coordinates and converts them into Kinect coordinates by using a transformation matrix (see Figure 2). This matrix is created by the calibration procedure, in which the TUI location and its plane are transformed into the Kinect coordinate system. More information is available in a technical report [L2].
Figure 2: Application linking the recognition of 2D and 3D gestures.
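The coordinate conversion performed by the Body Reader can be illustrated with a small sketch. It is an assumption-laden example rather than the project's calibration code: it supposes that calibration yields the Kinect-space positions of three tabletop corners (`origin`, `x_corner`, `y_corner`) and that TUI coordinates are expressed in metres on the table plane. The function names `calibrate` and `tui_to_kinect` are hypothetical.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _unit(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def calibrate(origin, x_corner, y_corner):
    """Build a 4x4 homogeneous matrix from TUI plane coordinates to Kinect space.

    The three arguments are Kinect-space positions (metres) of the tabletop
    corner used as TUI origin and of the corners along the TUI x and y axes
    (an assumed calibration input).
    """
    ex = _unit(_sub(x_corner, origin))              # TUI x axis in Kinect space
    ez = _unit(_cross(ex, _sub(y_corner, origin)))  # normal of the table plane
    ey = _cross(ez, ex)                             # y axis, made orthogonal to x
    return [
        [ex[0], ey[0], ez[0], origin[0]],
        [ex[1], ey[1], ez[1], origin[1]],
        [ex[2], ey[2], ez[2], origin[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def tui_to_kinect(matrix, u, v, height=0.0):
    """Map a TUI point (u, v), optionally `height` metres above the table."""
    p = [u, v, height, 1.0]
    return tuple(sum(matrix[r][c] * p[c] for c in range(4)) for r in range(3))


# Example: a table lying 0.5 m below the camera, starting 1 m in front of it.
M = calibrate(origin=[0.0, -0.5, 1.0],
              x_corner=[1.0, -0.5, 1.0],
              y_corner=[0.0, -0.5, 2.0])
print(tui_to_kinect(M, 0.2, 0.3))
```

With a matrix of this kind, the position of a tangible object reported by the tabletop can be compared directly with the 3D hand or body joints tracked by the depth camera, because both are then expressed in the same coordinate system.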
The analysis and evaluation of gestures is important both economically and socially, with many fields using gesture as input or output in their applications, for instance telecommunications, entertainment, and healthcare. Nonverbal behaviour is of particular importance in collaborative and virtual environments. Until now, few studies have addressed the nonverbal cues people display in collaborative virtual environments.
With the GETUI project we examined correlations between 3D gestures and collaborative problem-solving performance using a TUI. We compared two groups (high-achievers vs. low-achievers) and found that pointing gestures were used almost equally often by the two groups (M = 25.7 for low-achievers and M = 25.6 for high-achievers), while adaptors (head/mouth scratching, nail biting) were used slightly more frequently by the low-achievers and emblems (thumbs up, victory sign) were used mostly by the high-achievers. We addressed and measured collaboration as one of the transversal skills of the 21st century. TUIs and the visualisation of microworld scenarios can be used for formal school education and assessment as well as for vocational training and modern workplace learning. Today, in group settings, it is not only the group problem-solving performance that matters, but also soft skills, which include personal, social, and methodical competences. In the future, we plan to apply the assets of GETUI to collaborative virtual environments, in order to create and assess avatars’ nonverbal behaviour.
B. Ullmer, H. Ishii: “Emerging Frameworks for Tangible User Interfaces”, IBM Syst. J. 39, 915-931, 2000.
V. Maquil, et al.: “COPSE: Rapidly instantiating problem solving activities based on tangible tabletop interfaces”, in Proc. of the ACM on Human-Computer Interaction 1(1), 6, 2017.
D. McNeill: “Hand and mind: What gestures reveal about thought”, Chicago: University of Chicago Press, 1992.