by Pierre-Yves Oudeyer, Manuel Lopes (Inria), Celeste Kidd (Univ. of Rochester) and Jacqueline Gottlieb (Columbia Univ.)
Autonomous lifelong multitask learning is a grand challenge of artificial intelligence and robotics. Recent interdisciplinary research has been investigating a key ingredient to reach this goal: curiosity-driven exploration and intrinsic motivation.
A major difference between human learning and most current machine learning systems is that humans are capable of autonomously learning an open-ended repertoire of skills, often from very little data that they actively collect themselves. Humans show an extraordinary capacity to adapt incrementally to new situations and new tasks. They proactively seek, select, and explore new information to develop skills before they are actually needed.
In contrast, typical machine learning systems, including those associated with recent advances in deep (reinforcement) learning, learn to solve finite sets of tasks that are predefined by the engineer, and only with access to very large databases of examples. As a consequence, for every new task they are given, such machines require an engineer to program a new dedicated reward/cost function, and they need time to reprocess millions of learning examples.
One of the major components that enables autonomous, open-ended learning in humans is curiosity, a form of intrinsic motivation that pushes us to actively seek out information and practice new skills for the mere pleasure of learning and mastering them (as opposed to practicing them for extrinsic rewards such as money or social recognition). In the context of the interdisciplinary HFSP project “Curiosity”, the Flowers team [L1] at Inria (France), the Gottlieb Lab [L2] at Columbia University (US) and the Kidd Lab [L2] at the University of Rochester (US) are joining forces to study the mechanisms of curiosity-driven active learning in children, adults and monkeys, and how these mechanisms can be modelled and applied in machine learning systems. Combining artificial intelligence, machine learning, psychology and neuroscience, this project aims to push the frontiers of what we know about human active learning and how it can be built into machines.
Figure 1: Curiosity-driven learning in humans and robots (left: photo by Adam Fenster/Univ. Rochester; right: Milo Keller/ECAL).
Various strands of work in developmental robotics, AI and machine learning have begun to explore formal models of curiosity and intrinsic motivation (see Oudeyer, Gottlieb and Lopes, 2016, for a review), providing theoretical tools used in this project. In these models, curiosity is typically operationalised as a mechanism that selects which actions to experiment with or which (sub-)goals to pursue, based on various information-theoretic measures of their “interestingness”. Many such measures have already been studied with machines and robots (e.g., Bayesian surprise, uncertainty, information gain, learning progress or empowerment) and are often optimised within the reinforcement learning framework, where they are used as intrinsic rewards.
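As an illustration of how one of these measures can serve as an intrinsic reward, the following minimal Python sketch estimates learning progress for a single activity as the drop in mean prediction error between an older and a more recent window of attempts. The class name `LearningProgress`, the window size and the error values are assumptions chosen for this example; they are not taken from the models cited here, which differ in how they estimate and use progress.

```python
from collections import deque


class LearningProgress:
    """Hypothetical helper: learning progress for one activity.

    Progress is estimated as (mean error over the older window)
    minus (mean error over the recent window), so it is positive
    when prediction errors are decreasing.
    """

    def __init__(self, window=5):
        self.window = window
        # Keep only the last 2 * window errors.
        self.errors = deque(maxlen=2 * window)

    def update(self, error):
        self.errors.append(float(error))

    def intrinsic_reward(self):
        # Not enough data yet: report no progress.
        if len(self.errors) < 2 * self.window:
            return 0.0
        errs = list(self.errors)
        older = sum(errs[:self.window]) / self.window
        recent = sum(errs[self.window:]) / self.window
        return older - recent


# Illustrative data: prediction errors that shrink steadily.
lp = LearningProgress(window=5)
for e in [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]:
    lp.update(e)
reward = lp.intrinsic_reward()  # mean(10..6) - mean(5..1)
```

In a reinforcement learning setting, a value like `reward` could be added to, or used in place of, the extrinsic reward, so that the agent is drawn towards activities where its predictions are currently improving rather than towards those that are already mastered or unlearnable.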
Such algorithmic systems were recently shown to allow machines to efficiently solve difficult tasks in which extrinsic rewards are rare or deceptive, a situation that precludes an easy solution through traditional reinforcement learning methods. These systems were also shown to allow robots to efficiently learn multiple fields of parameterised high-dimensional continuous action policies. In addition, they allow robots to self-organise their own learning curriculum by self-generating and self-selecting their own goals, showing a progressive development of new skills with stages that reproduce fundamental properties of human development, for example in vocal development or tool use.
However, many open questions remain. For example, what are the features of interestingness that stimulate the curiosity of human brains? Can current computational models account for them, or be improved by taking inspiration from the heuristics used by humans? Are these mechanisms of curiosity hardwired or adapted during lifelong learning? As curiosity is a form of guidance for exploration and data collection in autonomous machines, it is also possible to investigate how it can be combined with other forms of guidance used by humans, such as imitation. For example, in recent robotics experiments, curiosity-driven robots learn repertoires of skills by actively seeking help from human teachers.
Finally, curiosity has long been known to be key in fostering efficient education. Computational models of these mechanisms open the possibility for new kinds of educational technologies that could foster intrinsically motivated learning. In recent work, Clement et al. (2015) showed one way to do this by presenting active teaching algorithms capable of personalising sequences of pedagogical exercises (e.g., math exercises for primary school children) through the dynamic selection of exercises that maximise informational quantities such as learning progress.
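The idea of selecting exercises by learning progress can be sketched with a simple bandit-style selector. The Python example below is only an illustration under stated assumptions: the class `ExerciseSelector`, the epsilon-greedy rule, the window size and the exercise names are all hypothetical, and this is not the algorithm of Clement et al. It estimates learning progress for each exercise type as the absolute change in success rate between an older and a recent window of attempts, and mostly proposes the exercise type with the highest estimate.

```python
import random


class ExerciseSelector:
    """Hypothetical epsilon-greedy selector driven by learning progress."""

    def __init__(self, exercises, window=4, epsilon=0.2, seed=0):
        self.history = {name: [] for name in exercises}
        self.window = window
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def record(self, exercise, success):
        # Store each attempt as 1.0 (success) or 0.0 (failure).
        self.history[exercise].append(1.0 if success else 0.0)

    def learning_progress(self, exercise):
        results = self.history[exercise][-2 * self.window:]
        if len(results) < 2 * self.window:
            return 0.0
        older = sum(results[:self.window]) / self.window
        recent = sum(results[self.window:]) / self.window
        # Absolute change in success rate between the two windows.
        return abs(recent - older)

    def choose(self):
        names = list(self.history)
        # With probability epsilon, explore a random exercise type.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(names)
        # Otherwise pick the type with the highest estimated progress.
        return max(names, key=self.learning_progress)


# Illustrative data: "addition" is already mastered, "fractions" is improving.
selector = ExerciseSelector(["addition", "fractions"], epsilon=0.0)
for _ in range(8):
    selector.record("addition", True)
for ok in [False] * 4 + [True] * 4:
    selector.record("fractions", ok)
chosen = selector.choose()
```

With exploration switched off (`epsilon=0.0`), the selector proposes the exercise type on which the student is currently improving rather than the one already mastered. A non-zero `epsilon` keeps occasionally sampling other exercise types, so that their progress estimates do not go stale.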
 P-Y. Oudeyer, J. Gottlieb, M. Lopes: “Intrinsic motivation, curiosity and learning: theory and applications in educational technologies”, Progress in Brain Research, 2016.
 P-Y. Oudeyer, L. Smith: “How Evolution may work through Curiosity-driven Developmental Process”, Topics in Cognitive Science, 1-11, 2016.
 B. Clement, D. Roy, P-Y. Oudeyer, M. Lopes: “Multi-Armed Bandits for Intelligent Tutoring Systems”, Journal of Educational Data Mining (JEDM), Vol 7, No 2, 2015.
Inria and Ensta ParisTech