by Michail Maniadakis and Panos Trahanias
Self-referential cognitive control is a fundamental capacity of animals and humans. Ongoing research at FORTH-ICS focuses on implementing this high-order cognitive skill in artificial autonomous agents, aiming to accomplish an important milestone towards the seamless integration of robots into human societies.
The long-term goal of human-robot symbiosis requires equipping artificial agents with the capacity to autonomously control their thought and behaviour. The meta-level mental processes responsible for controlling cognitive activities are referred to as executive control functions. Research at FORTH-ICS focuses on implementing such meta-level processes in artificial agents and investigating the dynamics of their interaction with the cognitive activities under control. Interestingly, besides their high potential for developing truly autonomous and intelligent robots, cognitive systems with executive control capacity may also suggest novel explanations of the working principles of the human brain.
A well-known experiment for investigating executive control functions in the human brain is the Wisconsin Card Sorting Test (WCST), in which a subject is asked to discover and apply a card-sorting rule based on reward and punishment feedback. At unpredictable times during the task, the rule is changed by the experimenter and must be re-discovered by the subject. The ordinary WCST can be further enriched with the option of betting on behavioural outcomes (i.e., success or failure of sorting), testing the capacity of subjects to monitor and express confidence in the currently adopted rule.
Figure 1: Phase plots of the first two principal components of neural activity for the Type-A (left) and Type-B (right) CTRNNs.
We have designed a mobile-robot task that resembles the WCST-with-Betting task, investigating rule switching in a sample-response paradigm. The agent has to learn three sample-response rules, selecting, applying and re-selecting each of them as indicated by reward and punishment signals provided by the experimenter. The task is based on three response rules, named Same Side (SS), Opposite Side (OS) and No Response (NR), which guide robot behaviour in a T-shaped environment. According to the SS rule, the agent must navigate towards the left wing if the light source appears on its left side, and towards the right wing if the light source appears on its right side. According to the OS rule, the robot has to turn in the direction opposite to the light, i.e., right when the light appears on the left, and left when it appears on the right. In the case of the NR rule, the robot should ignore the side of the light and stay close to the starting position. The rule-following and rule-switching capacity of the robot is evaluated over a long sequence of trials examining all possible rule combinations. At the beginning of each trial, the agent bets on the success of the forthcoming response, having the opportunity to gain a profit. Overall, the task requires the coordination of a range of cognitive skills: generating motor commands that efficiently drive the robot, maintaining working memory of the currently followed rule, detecting conflicts between the adopted rule and the reward or punishment feedback, and self-monitoring to develop confidence and place bets on the basis of the selected rule.
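For concreteness, the following minimal Python sketch spells out the stimulus-response mapping defined by the three rules. The function and variable names are illustrative only; in the actual experiments this mapping is not hand-coded but must be discovered by the robot's neural controller through reward and punishment feedback.

```python
# Illustrative mapping of the three sample-response rules to expected behaviour.
# Names and structure are hypothetical; the actual robot is controlled by an
# evolved CTRNN, not by explicit rules like these.

RULES = ("SS", "OS", "NR")  # Same Side, Opposite Side, No Response

def expected_response(rule: str, light_side: str) -> str:
    """Return the target behaviour for a given rule and cue side ('left'/'right')."""
    if rule == "SS":                      # navigate towards the cued side
        return light_side
    if rule == "OS":                      # navigate towards the opposite side
        return "right" if light_side == "left" else "left"
    if rule == "NR":                      # ignore the cue, stay near the start
        return "stay"
    raise ValueError(f"unknown rule: {rule}")

# Example trial: the experimenter rewards responses that match the hidden rule.
hidden_rule = "OS"
response = expected_response("SS", "left")           # agent currently follows SS
rewarded = response == expected_response(hidden_rule, "left")
print(response, rewarded)                            # -> 'left', False (punishment)
```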
The exploration of executive control mechanisms implemented in a two-level Continuous Time Recurrent Neural Network (CTRNN) is based on an evolutionary approach. We have run several statistically independent evolutionary processes, which revealed two basic mechanisms for solving the underlying problem, named Type-A and Type-B. This is illustrated in Figure 1, which shows the phase plots of the first two principal components of CTRNN activity (each rule is shown in a different colour).
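As background, a CTRNN is governed by standard leaky-integrator dynamics in which each neuron's state decays towards the weighted, sigmoided activity of the others plus its external input. The Python sketch below integrates these dynamics with Euler's method and projects the recorded activity onto its first two principal components, in the spirit of the phase plots of Figure 1. The network size, parameters and inputs are arbitrary placeholders, not the two-level architecture or the evolved weights used in this work.

```python
import numpy as np

def ctrnn_step(y, W, tau, theta, I, dt=0.01):
    """One Euler step of standard CTRNN dynamics:
       tau_i * dy_i/dt = -y_i + sum_j W[i, j] * sigmoid(y_j + theta_j) + I_i
    In this work W, tau and theta would be set by the evolutionary search."""
    sigma = 1.0 / (1.0 + np.exp(-(y + theta)))       # neuron outputs
    dydt = (-y + W @ sigma + I) / tau
    return y + dt * dydt

# Record activity over a trial and project it onto its first two principal
# components, as in the phase plots of Figure 1 (illustrative placeholder setup).
rng = np.random.default_rng(0)
n = 10
y = np.zeros(n)
W = rng.normal(scale=2.0, size=(n, n))
tau = np.full(n, 0.5)
theta = rng.normal(size=n)

trace = []
for t in range(2000):
    I = np.zeros(n)
    I[0] = 1.0                                       # e.g. a light cue on one sensor
    y = ctrnn_step(y, W, tau, theta, I)
    trace.append(y.copy())

X = np.array(trace) - np.mean(trace, axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
pcs = X @ Vt[:2].T                                   # first two principal components
print(pcs.shape)                                     # (2000, 2) trajectory to plot
```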
In the case of Type-A, there is a partial overlap between the trajectories encoding rules SS and OS (i.e., the trajectories shown in red and green), while NR is represented by a distinct attractor (i.e., the blue trajectory). The overlap of SS and OS suggests that they are encoded as sub-clusters of a larger cluster that separates them from NR. This is a reasonable organization, since both SS and OS ask the agent to navigate in the environment, while NR asks it to ignore the cue stimulus and stay close to the starting position. In contrast, the plot corresponding to the Type-B solution shows three attractors akin to three different fixed points. This corresponds to clearly distinct representations for the rules SS, OS and NR. Such a rule encoding is also reasonable, given that the three rules are actually independent of one another. Further investigation of the executive control mechanisms self-organized in the CTRNN revealed that:
- Rule switching is achieved through the destabilization of the attractor-following neurodynamics caused by the lack of positive reward, which makes neural activity jump to a different attractor representing another rule (see the sketch after this list).
- The betting strategy is shaped by a self-monitoring procedure that makes the agent place high bets on the more stable rules.
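As a rough illustration of these two effects, the toy Python sketch below mimics them with an explicit state machine: repeated lack of reward "destabilizes" the currently adopted rule and triggers a jump to another one, while accumulated successes raise a confidence variable that gates high bets. In the actual system both behaviours emerge in the evolved CTRNN's neurodynamics rather than from hand-coded logic, and all names and thresholds used here are hypothetical.

```python
import random

RULES = ("SS", "OS", "NR")

def run_trials(hidden_rule_schedule, switch_threshold=2, high_bet_confidence=3):
    """Toy analogue of reward-driven rule switching and confidence-based betting.
    The evolved CTRNN realizes these effects in its neurodynamics; this explicit
    state machine only mirrors them for illustration."""
    current = random.choice(RULES)   # analogue of the currently occupied attractor
    confidence = 0                   # analogue of self-monitored rule stability
    failures = 0
    for hidden_rule in hidden_rule_schedule:
        bet = "high" if confidence >= high_bet_confidence else "low"
        rewarded = (current == hidden_rule)
        if rewarded:
            confidence += 1
            failures = 0
        else:
            failures += 1
            confidence = 0
            if failures >= switch_threshold:
                # lack of reward "destabilizes" the current rule: jump elsewhere
                current = random.choice([r for r in RULES if r != current])
                failures = 0
        print(f"hidden={hidden_rule} adopted={current} bet={bet} rewarded={rewarded}")

run_trials(["SS"] * 5 + ["NR"] * 5)
```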
The obtained results have additionally shown that the implementation of the rule-switching and betting mechanisms is affected by the characteristics of the rule-encoding scheme (i.e., Type-A or Type-B). Interpreting this observation within the framework of biological cognition, we may say that prior experience, and the way a task is understood by a subject, is likely to affect the development of the relevant dynamics in their brain. In other words, if two subjects understand a given problem in different ways, they may use their brains in different ways when solving it. Such a subjective view of cognition is particularly relevant for high-level skills, because these are not directly linked to the phylogenetically constrained characteristics of the low-level sensorimotor system.
Ongoing research activities at FORTH-ICS capitalize on the above results, investigating issues that build on executive control, such as risk-taking by artificial agents.
Link:
http://www.ics.forth.gr/cvrl
Please contact:
Michail Maniadakis
Institute of Computer Science (ICS), FORTH, Greece
Tel: +30 2810 391701
E-mail:
Panos Trahanias
Institute of Computer Science (ICS), FORTH, Greece
Tel: +30 2810 391715
E-mail: