by José Rouillard, Jean-Claude Tarby, Xavier Le Pallec and Raphaël Marvie
With the MINY (Multimodality Is Nice for You!) project, our goal is to propose novel ways of taking multiple modalities of interaction into account. Using a model-driven engineering approach, we present suggestions for tackling the challenges around the design of intelligent and multimodal cognitive systems.
Technology's evolution is an unstoppable process. Consider the regular release of new devices such as smartphones or multi-touch tabletops: each new version is more powerful and more interconnected than the previous one. Home automation is an example of improved communication: washing machines can run Android and be driven remotely from a smartphone or computer. While such interaction is easy to implement, most of these systems offer a single modality of interaction: the Wii, for instance, only supports movement-based interactions. Games are at the more complex end of the scale, relying on two modalities such as mouse/keyboard and voice.
Multimodality is the ability to combine different modalities of interaction (voice, gesture, touch, etc.) as input and/or output, as in Bolt's historic "Put That There" system from 1980. Our goal is the design and implementation of intelligent multimodal systems. By intelligent, we mean the ability to make decisions, to request additional information from outside (e.g., from the user or from other applications), and to learn (from mistakes, from the user's actions, etc.).
Our approach is top-down and pays particular attention to the heterogeneity of the devices involved. It begins with the specification of the tasks the system can achieve. We then choose the devices best suited to realizing these tasks. This allows the generation of code supporting the interaction with the system and the associated devices (e.g., X10 home automation, but also smartphones, webcams, etc.).
To support this top-down approach, we use model-driven engineering (MDE). A first reason for this choice is that we are designing applications for various domains (such as home automation, botany and tourism). While the modalities remain the same, their implementations change from one application to another.
A second reason is that our work addresses both design and execution. At the design level, systems should not only be designed by computer experts but also by domain experts such as botanists. During execution, end users should be able to adapt their applications to their context of use (and its constraints). A tactile modality, for instance, is not practical when the temperature requires wearing gloves. In this situation one should be able to switch to a vocal modality on the fly.
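The gloves example can be sketched as a small run-time selection mechanism. The following Python sketch uses hypothetical names (`Modality`, `ModalityManager`, the `gloves` and `noise_level` context keys are our own illustration, not part of the MINY project):

```python
# Hypothetical sketch: selecting the active input modality at run time,
# based on constraints of the current context of use (e.g., wearing gloves).

class Modality:
    def __init__(self, name, usable):
        self.name = name
        # 'usable' is a predicate deciding if this modality fits a context.
        self.usable = usable

class ModalityManager:
    def __init__(self, modalities):
        # Modalities are tried in order of preference.
        self.modalities = modalities

    def select(self, context):
        # Return the first modality whose constraints the context satisfies.
        for m in self.modalities:
            if m.usable(context):
                return m.name
        raise RuntimeError("no usable modality in this context")

manager = ModalityManager([
    Modality("touch", lambda ctx: not ctx.get("gloves", False)),
    Modality("voice", lambda ctx: ctx.get("noise_level", 0) < 70),  # dB
])

print(manager.select({"gloves": False}))                    # -> touch
print(manager.select({"gloves": True, "noise_level": 40}))  # -> voice
```

The on-the-fly switch described above corresponds to re-evaluating `select` whenever the context changes.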
The MDE approach provides us with interesting tools. Meta-modelling and separation of concerns (or aspects) help support multiple domains while capitalizing on recurrent aspects. Model transformation and code generation ease the support of similar modalities across different domains while providing a tailored implementation every time.
Figure 1: Models of application domains are integrated in an interactions model template. A compass, for example, can be used in different ways (input/output) according to a particular context.
We face three major challenges in our research:
1. The modelling of modalities.
This must be done at an abstract level in order to facilitate modelling in different domains. For example, a compass in a smartphone can be seen as many kinds of information sources. As we can see in Figure 1, it can be used:
- as an output, to give a direction to the user
- as an input selector among four positions (North, South, East, West)
- as a chronological switcher (past, present and future, according to movement toward a particular direction)
- as a digital switch (for instance, ordered by financial budget from the cheapest to the most expensive, the user visualizes the covering of a roof with slate, tile, etc.)
- as a metal detector.
The difficulty lies in modelling these possibilities of interaction, without any application domain in mind.
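One way to picture this domain independence is a single abstract compass source with pluggable interpretations. The sketch below is our own illustration (the function names and the roof-covering list are hypothetical), not the MINY model itself:

```python
# Hypothetical sketch: one abstract compass heading (0-359 degrees),
# several domain-independent interpretations.

def cardinal(heading):
    # Input selector among four positions (North, East, South, West).
    return ["North", "East", "South", "West"][round(heading / 90) % 4]

def discrete_switch(heading, options):
    # Map the heading onto an ordered list of options, e.g., roof
    # coverings sorted by price; the list itself is domain knowledge,
    # the mapping is not.
    index = int(heading / 360 * len(options)) % len(options)
    return options[index]

print(cardinal(85))                                       # -> East
print(discrete_switch(200, ["slate", "tile", "thatch"]))  # -> tile
```

The same abstract source thus serves as a direction indicator in one application and as a selector in another, which is exactly the modelling difficulty: the interpretations must be captured without committing to any domain.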
2. Managing the heterogeneity of the devices and components that can be used in a system.
In the context of MDE, our main proposition is to design systems in an abstract and generic way in order to allow the generation of code (with multimodal capabilities, for instance) according to a particular context. To do so, we use the concept of "model template" in order to dynamically generate suitable scripts and software code.
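The idea of combining an abstract model with a template to emit device-specific code can be sketched in a few lines. Everything below (the model keys, the `send_command` call in the generated text, the X10-style device name) is a hypothetical illustration of template-based generation, not the project's actual tooling:

```python
from string import Template

# Hypothetical sketch of "model template" code generation: an abstract
# interaction model is merged into a device-specific template to produce
# the final script (here, a toy X10-style command handler).

model = {"task": "switch_light", "modality": "voice", "device": "x10_lamp_a1"}

template = Template(
    "def handle_$task(event):\n"
    "    # generated for modality '$modality'\n"
    "    send_command('$device', event.value)\n"
)

generated = template.substitute(model)
print(generated)
```

Swapping the model (another device, another modality) regenerates a tailored implementation, which is the point of keeping the template generic.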
3. The intelligence of the system.
In the context of ambient intelligence, for example, it is not easy for supposedly cognitive and intelligent systems to detect particular situations or user behaviours (danger, the need for help or for more information, etc.). Ideally the system would understand specific situations, learn during interactions, and offer appropriate suggestions, by making inferences based on elements within the operating context.
Our future work will focus on three issues:
- multimodality composition: How to allow the usage of "synergistic" multimodality rather than only "alternate" multimodality?
- MDE: How to use generic models (template models) in order to support our approach?
- the intelligence and cognitive abilities of the systems.
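Synergistic composition, as opposed to simply alternating between modalities, can be illustrated with Bolt's "Put That There" scenario: a spoken deictic is fused with pointing events close to it in time. The fusion rule and all names below are a hypothetical sketch, assuming timestamped events from both modalities:

```python
# Hypothetical sketch of "synergistic" fusion: deictic words in a speech
# stream are resolved against pointing events that are close in time.

def fuse(speech, pointings, window=1.0):
    # speech: list of (word, timestamp); pointings: list of (target, timestamp).
    # Replace each deictic word with the pointed-at target nearest in time,
    # if one falls within the fusion window (seconds).
    resolved = []
    for word, t in speech:
        if word in ("that", "there"):
            target, pt = min(pointings, key=lambda p: abs(p[1] - t))
            if abs(pt - t) <= window:
                word = target
        resolved.append(word)
    return " ".join(resolved)

speech = [("put", 0.0), ("that", 0.5), ("there", 1.4)]
pointings = [("lamp", 0.6), ("table", 1.5)]
print(fuse(speech, pointings))  # -> put lamp table
```

Neither modality alone carries the full command; only their combination does, which is what distinguishes synergistic from alternate multimodality.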
Since the beginning of the project, we have been using workflow mechanisms as the reasoning base. Currently, we are experimenting with multi-agent systems in order to provide the user with information and decisions based on more flexible, intelligent and autonomous behaviours.
Laboratoire d'Informatique Fondamentale de Lille (LIFL), University of Lille, France
Tel: +33 3 20 33 59 37