by Frederic Alexandre (Inria)
Taking inspiration from human learning draws attention to one essential but poorly studied characteristic of learning: autonomy.
One remarkable characteristic of human learning is that, although we may not excel in any specific domain, we are quite good at most of them and able to adapt when a new problem appears. We are versatile and adaptable, which are critical properties for autonomous learning: we can learn in a changing and uncertain world. With neither explicit labels nor data preprocessing or segmentation, we are able to pay attention to important information and ignore noise. We define our own goals and the means to reach them, evaluate our own performance, and apply previously learned knowledge and strategies in different contexts. In contrast, recent advances in machine learning have produced impressive results, with powerful algorithms surpassing human performance in some very specific domains of expertise, but these models still have very poor autonomy.
Our Mnemosyne Inria project-team works in the Bordeaux Neurocampus with medical and neuroscience teams to develop systemic models in computational neuroscience, focusing on these original characteristics of human learning. Our primary goal is to develop models of the different kinds of memory in the brain and of their interactions, with the aim of using them to study neurodegenerative diseases. Another important outcome of our work is to propose original machine learning models that integrate some of these characteristics.
Figure: Example of a large-scale model, described in Carrere and Alexandre (2016), studying the interactions between two systems in the brain, respectively influenced by noradrenaline (NE, released by the Locus Coeruleus) and dopamine (DA, released by the VTA) and involving different regions of the loops between the prefrontal cortex (ACC, OFC) and the basal ganglia (Ventral Striatum, GPi). The NE system evaluates the level of non-stationarity of sensory input and adjusts the level of attention on sensory cues accordingly, resulting in a shift between exploitation of previously learned rules and exploration of new rules in the DA system, which performs action selection.
We believe that important steps toward autonomous learning can be made along the following lines of research:
Developing an interacting system of memories
Some circuits in the brain are mobilised to learn explicit knowledge and others to learn procedures. In addition to modelling these circuits, studying their interactions is crucial to understanding how one system can supervise another, resulting in a more autonomous way of learning. In the domain of perceptual learning in the medial temporal lobe, we model episodic memories that store important events in a single trial and later, through consolidation in other circuits, form new semantic categories. In the domain of decision-making in the loops between the prefrontal cortex and the basal ganglia, we model the cerebral mechanisms by which goal-directed behaviour, relying on explicit evaluation of expected rewards, can later become habits that are triggered automatically, with less flexibility but increased effectiveness.
Coping with uncertainty
We learn the rules that govern the world and consider it uncertain for two main reasons: it may be predictable only up to a certain level (stochastic rules) or it may be non-stationary (changing rules). Whereas standard probabilistic models are rather good at tackling the first kind of uncertainty, non-stationarity in a dynamic world raises more difficult problems. We are studying how regions of the medial prefrontal cortex detect and evaluate the kind and the level of uncertainty by monitoring the recent history of performance in handling incoming events correctly. These regions are also reported to activate the release of neuromodulators such as monoamines, known to play a central role in adaptation to uncertainty (Yu and Dayan, 2005). In a nutshell, instead of developing large sets of circuits to manage uncertainty as stable rules in various contexts, the cerebral system has developed a general-purpose system that adapts to uncertainty, with hyperparameters subject to meta-learning by neuromodulation. This is what we are currently trying to understand more precisely.
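The idea of monitoring recent performance to tune a hyperparameter can be illustrated with a toy two-armed bandit. This is only a minimal sketch under stated assumptions, not the model developed by the team: the function names, the fast and slow running averages, the `floor` and `gain` parameters and the epsilon-greedy policy are all hypothetical choices made for illustration. The reward probabilities are stochastic throughout and swap once, so both kinds of uncertainty are present.

```python
import random


def update_average(avg, value, rate):
    """Exponential running average of `value` with learning rate `rate`."""
    return avg + rate * (value - avg)


def exploration_rate(fast, slow, floor=0.05, gain=2.0):
    """Hypothetical mapping: explore more when recent reward (fast average)
    falls below the long-run level (slow average)."""
    drop = max(0.0, slow - fast)
    return min(1.0, floor + gain * drop)


def run_bandit(n_trials=400, switch_at=200, seed=0):
    """Two-armed bandit whose reward probabilities swap at `switch_at`."""
    rng = random.Random(seed)
    p_reward = [0.9, 0.1]      # stochastic rule: rewards are only probable
    values = [0.5, 0.5]        # learned value of each action
    fast = slow = 0.5          # recent vs. long-run performance monitors
    history = []
    for t in range(n_trials):
        if t == switch_at:
            p_reward.reverse()  # non-stationarity: the rule changes
        eps = exploration_rate(fast, slow)
        if rng.random() < eps:
            action = rng.randrange(2)                     # explore
        else:
            action = 0 if values[0] >= values[1] else 1   # exploit
        reward = 1.0 if rng.random() < p_reward[action] else 0.0
        values[action] = update_average(values[action], reward, 0.1)
        fast = update_average(fast, reward, 0.3)
        slow = update_average(slow, reward, 0.02)
        history.append((action, reward, eps))
    return history
```

In this sketch a single monitored quantity (the drop of recent performance below its long-run level) modulates the exploration rate, rather than a separate rule being stored for each context. Calling `run_bandit()` returns the per-trial `(action, reward, eps)` tuples, which can be plotted to inspect how exploration evolves around the switch.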
Embodiment for emotional learning
One important source of autonomy is our body itself, which tells us what is good or bad for us and what must be sought out or avoided. Pavlovian learning is modelled to detect and learn to predict biologically significant aversive and appetitive (emotional) stimuli, which are key targets for attentional processing and for the organisation of behaviour. This learning can be done autonomously if the model of the cerebral system is associated with a substrate corresponding to the body, including sensors for pain and pleasure. We take this a step further, extending the study of Pavlovian rules to integrate the effects of Pavlovian responses on the body and the neuromodulatory system.
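As a rough illustration of Pavlovian learning driven by bodily signals, the sketch below uses the classical Rescorla-Wagner rule, a standard textbook model that is swapped in here for illustration and is not claimed to be the rule used in the team's models. The `body_signal` function and its pain and pleasure inputs are hypothetical stand-ins for the body's sensors.

```python
def body_signal(pain, pleasure):
    """Hypothetical body substrate: net valence reported by the sensors
    (positive is appetitive, negative is aversive)."""
    return pleasure - pain


def rescorla_wagner(trials, alpha=0.5):
    """Learn how well a cue predicts the body's valence signal.

    `trials` is a sequence of (cue_present, pain, pleasure) tuples.
    Returns the predicted valence of the cue after each trial.
    """
    prediction = 0.0
    trace = []
    for cue_present, pain, pleasure in trials:
        if cue_present:
            # The prediction error is computed from the body itself,
            # so no external label or teacher is needed.
            error = body_signal(pain, pleasure) - prediction
            prediction += alpha * error
        trace.append(prediction)
    return trace


# Two cue-pleasure pairings followed by one pairing with nothing (extinction).
trace = rescorla_wagner([(True, 0.0, 1.0), (True, 0.0, 1.0), (True, 0.0, 0.0)])
print(trace)  # [0.5, 0.75, 0.375]
```

The teaching signal comes entirely from the simulated body, which is the sense in which such learning can be called autonomous in this sketch.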
From motivation to self-evaluation
Considering the brain and the body together also introduces physiological needs, which are fundamental to introducing internal goals in addition to the external goals mentioned above. This is the basis for renewed approaches to reinforcement learning, defining criteria more complex than a simple scalar representing an abstract reward. In humans, another important source of information for learning autonomously is self-evaluation of performance. Notably, both motivation and self-evaluation processing are central to cognitive control (Koechlin et al., 2003) and are reported to be located in the anterior part of the prefrontal cortex, which we endeavour to integrate into our models.
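One simple way to picture a reward that depends on physiological needs is a drive-reduction scheme: the agent keeps internal variables with setpoints, and an outcome is rewarding to the extent that it brings the body closer to those setpoints. The sketch below is an illustrative assumption, not the formulation used in our models; the variable names (`energy`, `water`), the Euclidean drive and the setpoint values are all hypothetical.

```python
import math


def drive(state, setpoints):
    """Distance of the internal state from its setpoints (0 means all needs met)."""
    return math.sqrt(sum((setpoints[k] - state[k]) ** 2 for k in setpoints))


def homeostatic_reward(state, next_state, setpoints):
    """Reward is the reduction in drive produced by an outcome."""
    return drive(state, setpoints) - drive(next_state, setpoints)


setpoints = {"energy": 1.0, "water": 1.0}

# Eating when hungry moves the body toward its setpoint: positive reward.
hungry = {"energy": 0.4, "water": 1.0}
fed = {"energy": 1.0, "water": 1.0}
r_hungry = homeostatic_reward(hungry, fed, setpoints)

# The same meal when already sated overshoots the setpoint: negative reward.
overfed = {"energy": 1.6, "water": 1.0}
r_sated = homeostatic_reward(fed, overfed, setpoints)
```

Here the value of the same external event changes sign with the internal state, which a fixed scalar reward attached to the event could not express.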
In addition to developing models to explore each of these mechanisms in interaction with neuroscience and medicine, we also integrate them in a common platform defining the adaptive characteristics of an autonomous agent exploring an unknown virtual world together with the characteristics of its artificial body. Beyond machine learning, this numerical testbed is also a valuable simulation tool for our medical and neuroscientist colleagues.
A.J. Yu, P. Dayan: “Uncertainty, Neuromodulation and Attention”, Neuron, 46(4), 2005.
M. Carrere, F. Alexandre: “Modeling the sensory roles of noradrenaline in action selection”, Sixth Joint IEEE International Conference on Developmental Learning and Epigenetic Robotics, 2016.
E. Koechlin, C. Ody, F. Kouneiher: “The Architecture of Cognitive Control in the Human Prefrontal Cortex”, Science, 302(5648):1181–1185, 2003.
Inria Bordeaux Sud-Ouest, France