Publications

Automatic processing of emotion information through deep neural networks (DNN) can have great benefits for human-machine interaction. Vice versa, machine learning can profit from concepts known from human information processing (e.g., visual attention). We employed a recurrent DNN incorporating a spatial attention mechanism for facial emotion recognition (FER) and compared the output of the network with results from human experiments. The attention mechanism enabled the network to select relevant face regions to achieve state-of-the-art performance on a FER database containing images from realistic settings. A visual search strategy showing some similarities with human saccading behavior emerged when the model’s perceptive capabilities were restricted. However, the model then failed to form a useful scene representation.
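The spatial attention step can be pictured as a softmax weighting over candidate face regions. The sketch below is a minimal plain-Python illustration of dot-product attention, not the recurrent network described above; the function names and the scoring scheme are assumptions for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region_features, query):
    """Score each region's feature vector against a query vector,
    normalize the scores with a softmax, and pool the regions into
    a single context vector weighted by attention."""
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [sum(w * feat[d] for w, feat in zip(weights, region_features))
               for d in range(dim)]
    return weights, context
```

A recurrent model would update the query at each glimpse from its hidden state, so that successive fixations select different face regions.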

We present a general framework for fusing pre-trained multisensor object detection networks for perception in autonomous cars at an intermediate stage using perspective-invariant features. The key innovation is an autoencoder-inspired transformer module which transforms both the perspective and the feature activation layout from one sensor modality to another. Transformed feature maps can be combined with those of a modality-native object detector to enhance performance and reliability through a simple fusion scheme. Our approach is not limited to a specific object detection network architecture or even to specific sensor modalities. We show the effectiveness of the proposed scheme through experiments on our own dataset as well as on the KITTI dataset.
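The fusion idea can be illustrated with a toy example: a linear map (learned in the actual system, hand-set here) carries a feature vector from one sensor's feature space into another's, after which a simple convex combination fuses it with the modality-native feature map. The names `transform` and `fuse` are illustrative, not taken from the paper.

```python
def transform(feature_map, weights):
    """Map a source-modality feature vector into the target modality's
    feature space with a linear map (hand-set here; learned in practice)."""
    return [sum(w * f for w, f in zip(row, feature_map)) for row in weights]

def fuse(native, transformed, alpha=0.5):
    """Simple convex-combination fusion of a modality-native feature map
    with a feature map transformed from another sensor."""
    return [alpha * n + (1 - alpha) * t for n, t in zip(native, transformed)]
```

Because the fusion happens on feature maps rather than raw signals or final detections, each unimodal detector can be trained and frozen independently beforehand.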

We present a novel architecture for intermediate fusion of Lidar and camera data for neural network-based object detection. The key component is a transformer module which learns a transformation of feature maps from one sensor space to another. This allows large parts of the multi-modal object detection network to be trained unimodally, reducing the required amount of costly multi-modal labeled data. We show the effectiveness of the transformer as well as of the proposed fusion scheme.

In addition to the prefrontal cortex (PFC), the basal ganglia (BG) have been increasingly often reported to play a fundamental role in category learning, but the systems-level circuits of how both interact remain to be explored. We developed a novel neuro-computational model of category learning that particularly addresses the BG-PFC interplay. We propose that the BG bias PFC activity by removing the inhibition of cortico-thalamo-cortical loops and thereby provide a teaching signal to guide the acquisition of category representations in the cortico-cortical associations to the PFC. Our model replicates key behavioral and physiological data of macaque monkeys learning a prototype distortion task from Antzoulatos and Miller (2011). Our simulations allowed us to gain a deeper insight into the observed drop of category selectivity in striatal neurons seen in the experimental data and in the model. The simulation results and a new analysis of the experimental data, based on the model’s predictions, show that the drop in category selectivity of the striatum emerges as the variability of responses in the striatum rises when confronting the BG with an increasingly larger number of stimuli to be classified. The neuro-computational model therefore provides new testable insights into systems-level brain circuits involved in category learning, which may also be generalized to better understand other cortico-basal ganglia-cortical loops.

Computer science offers a large set of tools for prototyping, writing, running, testing, validating, sharing and reproducing results; however, computational science lags behind. In the best case, authors may provide their source code as a compressed archive and they may feel confident their research is reproducible. But this is not exactly true. Jonathan Buckheit and David Donoho proposed more than two decades ago that an article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code, and data that produced the result. This implies new workflows, in particular for peer review. Existing journals have been slow to adapt: source code is rarely requested and is hardly ever actually executed to check that it produces the results advertised in the article. ReScience is a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research can be replicated from its description. To achieve this goal, the whole publishing chain is radically different from that of other traditional scientific journals. ReScience resides on GitHub, where each new implementation of a computational study is made available together with comments, explanations, and software tests.

Hippocampal place-cell sequences observed during awake immobility often represent previous experience, suggesting a role in memory processes. However, recent reports of goals being overrepresented in sequential activity suggest a role in short-term planning, although a detailed understanding of the origins of hippocampal sequential activity and of its functional role is still lacking. In particular, it is unknown which mechanism could support efficient planning by generating place-cell sequences biased toward known goal locations, in an adaptive and constructive fashion. To address these questions, we propose a model of spatial learning and sequence generation as interdependent processes, integrating cortical contextual coding, synaptic plasticity and neuromodulatory mechanisms into a map-based approach. Following goal learning, sequential activity emerges from continuous attractor network dynamics biased by goal memory inputs. We apply Bayesian decoding on the resulting spike trains, allowing a direct comparison with experimental data. Simulations show that this model (1) explains the generation of never-experienced sequence trajectories in familiar environments, without requiring virtual self-motion signals, (2) accounts for the bias in place-cell sequences toward goal locations, (3) highlights their utility in flexible route planning, and (4) provides specific testable predictions.
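Bayesian decoding of position from place-cell spike trains is typically done by assuming independent Poisson spiking and a flat prior over positions; the sketch below is a generic maximum a posteriori decoder of this kind, with illustrative tuning curves, not the decoder or model parameters of the study above.

```python
import math

def decode_position(spike_counts, tuning, positions, dt=0.02):
    """Maximum a posteriori position under independent Poisson spiking:
    P(x | n) is proportional to prod_i rate_i(x)^n_i * exp(-dt * rate_i(x))
    with a flat prior. tuning[i][x] is the rate (Hz) of cell i at position x."""
    log_post = []
    for x in positions:
        lp = 0.0
        for i, n in enumerate(spike_counts):
            rate = tuning[i][x]
            lp += n * math.log(rate * dt) - rate * dt
        log_post.append(lp)
    # Return the position with the highest posterior probability.
    return positions[log_post.index(max(log_post))]
```

Sliding this decoder over short time bins of a sequence event yields the decoded trajectories that can be compared directly with experimental data.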

Recent advances in deep reinforcement learning methods have attracted a lot of attention, because of their ability to use raw signals such as video streams as inputs, instead of pre-processed state variables. However, the most popular methods (value-based methods, e.g. deep Q-networks) focus on discrete action spaces (e.g. the left/right buttons), while realistic robotic applications usually require a continuous action space (for example the joint space). Policy gradient methods, such as stochastic policy gradient or deep deterministic policy gradient, propose to overcome this problem by allowing continuous action spaces. Despite their promise, they suffer from long training times, as they need huge numbers of interactions to converge. In this paper, we investigate to what extent a recent asynchronously parallel actor-critic approach, initially proposed to speed up discrete RL algorithms, can be used for the continuous control of robotic arms. We demonstrate the capabilities of this end-to-end learning algorithm on a simulated 2-degrees-of-freedom robotic arm and discuss its applications to more realistic scenarios.
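The actor update behind such continuous-action methods can be sketched for a one-dimensional Gaussian policy, where an advantage estimate scales the score function of the policy; this is a generic textbook illustration, not the exact algorithm studied in the paper, and the function name is an assumption.

```python
def gaussian_policy_step(mu, sigma, state_value, reward, action, lr=0.01):
    """One actor update for a 1-D Gaussian policy pi(a|s) = N(mu, sigma^2).
    The advantage (here simply reward - V(s)) scales the score function
    d log pi / d mu = (action - mu) / sigma^2, moving the mean toward
    actions that did better than the critic expected."""
    advantage = reward - state_value
    grad_mu = (action - mu) / sigma ** 2
    return mu + lr * advantage * grad_mu
```

In the asynchronous actor-critic setting, many workers compute such gradients in parallel on their own environment copies and apply them to shared parameters, which is where the speed-up comes from.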

Neuro-computational models make it possible to study the brain mechanisms involved in intelligent behavior and to extract essential computational principles which can be implemented in cognitive systems. They are a promising solution to achieve a brain-like artificial intelligence that can compete with natural intelligence on realistic behaviors. A crucial property of intelligent behavior is motivation, defined as the incentive to interact with the world in order to achieve specific goals, either extrinsic (obtaining rewards such as food or money, or avoiding pain) or intrinsic (satisfying one’s curiosity, having fun). In the human brain, motivated or goal-directed behavior depends on a network of different structures, including the prefrontal cortex, the basal ganglia and the limbic system. Dopamine, a neurotransmitter associated with reward processing, plays a central role in coordinating the activity of this network. It structures processing in high-level cognitive areas along a limbic-associative-motor gradient and impacts the learning capabilities of the whole system. In this habilitation thesis, I present biologically-constrained neuro-computational models which investigate the role of dopamine in visual object categorization and memory retrieval (Vitay and Hamker, 2008), reinforcement learning and action selection (Vitay and Hamker, 2010), the updating, learning and maintenance of working memory (Schroll, Vitay and Hamker, 2012) and timing processes (Vitay and Hamker, 2014). These models outline the many mechanisms by which the dopaminergic system regulates cognitive and emotional behavior: bistable processing modes in the cerebral cortex, modulation of synaptic transmission and plasticity, allocation of cognitive resources and signaling of relevant events. Finally, I present a neural simulator able to simulate a variety of neuro-computational models efficiently on parallel architectures (Vitay, Dinkelbach and Hamker, 2015).

A reference implementation of: Laje, R. and Buonomano, D.V. (2013). Robust timing and motor patterns by taming chaos in recurrent neural networks. Nat Neurosci 16(7):925-933. doi:10.1038/nn.3405
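The starting point of that study is a randomly connected rate network whose recurrent gain g > 1 places it in the chaotic regime, which the paper's "innate training" then tames. The sketch below simulates only this untrained chaotic network with illustrative parameters; it is not the reference implementation itself.

```python
import math
import random

def simulate_rnn(n=50, g=1.8, steps=200, dt=0.1, tau=1.0, seed=0):
    """Simulate a random recurrent tanh rate network:
    tau dx/dt = -x + W tanh(x), with W drawn from N(0, g^2/n).
    For g > 1 the dynamics are typically chaotic."""
    rng = random.Random(seed)
    scale = g / math.sqrt(n)
    W = [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(n)]
    x = [rng.gauss(0.0, 0.5) for _ in range(n)]
    for _ in range(steps):
        r = [math.tanh(xi) for xi in x]
        x = [xi + dt / tau * (-xi + sum(w * ri for w, ri in zip(row, r)))
             for xi, row in zip(x, W)]
    return x
```

Because tanh bounds the firing rates, the trajectories stay finite while remaining sensitive to initial conditions, which is exactly what makes reproducible timing hard without training.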

Many modern neural simulators focus on the simulation of networks of spiking neurons on parallel hardware. Another important framework in computational neuroscience, rate-coded neural networks, remains difficult or impossible to implement using these simulators. We present here the ANNarchy (Artificial Neural Networks architect) neural simulator, which allows the user to easily define and simulate rate-coded and spiking networks, as well as combinations of both. The interface in Python has been designed to be close to the PyNN interface, while the definition of neuron and synapse models can be specified using an equation-oriented mathematical description similar to the Brian neural simulator. This information is used to generate C++ code that will efficiently perform the simulation on the chosen parallel hardware (multi-core system or graphical processing unit). Several numerical methods are available to transform ordinary differential equations into efficient C++ code. We compare the parallel performance of the simulator to existing solutions.
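The kind of equation a user writes in such an equation-oriented description, e.g. tau*dr/dt = -r + input, ultimately compiles down to a simple numerical update loop. The hand-written explicit Euler step below stands in for the generated C++; it is a pedagogical sketch, not ANNarchy's actual code generator or API.

```python
def euler_rate_step(r, inputs, tau, dt=1.0):
    """One explicit Euler step of the rate equation tau*dr/dt = -r + input
    for a whole population: r <- r + dt/tau * (-r + input)."""
    return [ri + dt / tau * (-ri + I) for ri, I in zip(r, inputs)]
```

Iterating this step with a constant input drives each rate toward that input value, the fixed point of the ODE; other numerical methods (implicit, exponential) trade stability against cost in the same framework.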

Neural activity in dopaminergic areas such as the ventral tegmental area is influenced by timing processes, in particular by the temporal expectation of rewards during Pavlovian conditioning. Receipt of a reward at the expected time allows the computation of reward-prediction errors which can drive learning in motor or cognitive structures. Reciprocally, dopamine plays an important role in the timing of external events. Several models of the dopaminergic system exist, but the substrate of temporal learning is rather unclear. In this article, we propose a neuro-computational model of the afferent network to the ventral tegmental area, including the lateral hypothalamus, the pedunculopontine nucleus, the amygdala, the ventromedial prefrontal cortex, the ventral basal ganglia (including the nucleus accumbens and the ventral pallidum), as well as the lateral habenula and the rostromedial tegmental nucleus. Based on a plausible connectivity and realistic learning rules, this neuro-computational model reproduces several experimental observations, such as the progressive cancellation of dopaminergic bursts at reward delivery, the appearance of bursts at the onset of reward-predicting cues or the influence of reward magnitude on activity in the amygdala and ventral tegmental area. While associative learning occurs primarily in the amygdala, learning of the temporal relationship between the cue and the associated reward is implemented as a dopamine-modulated coincidence detection mechanism in the nucleus accumbens.

In Parkinson’s disease, a loss of dopamine neurons causes severe motor impairments. These motor impairments have long been thought to result exclusively from immediate effects of dopamine loss on neuronal firing in basal ganglia, causing imbalances of basal ganglia pathways. However, motor impairments and pathway imbalances may also result from dysfunctional synaptic plasticity – a novel concept of how Parkinsonian symptoms evolve. Here we built a neuro-computational model that allows us to simulate the effects of dopamine loss on synaptic plasticity in basal ganglia. Our simulations confirm that dysfunctional synaptic plasticity can indeed explain the emergence of both motor impairments and pathway imbalances in Parkinson’s disease, thus corroborating the novel concept. By predicting that dysfunctional plasticity results not only in reduced activation of desired responses, but also in their active inhibition, our simulations provide novel testable predictions. When simulating dopamine replacement therapy (which is a standard treatment in clinical practice), we observe a new balance of pathway outputs, rather than a simple restoration of non-Parkinsonian states. In addition, high doses of replacement are shown to result in overshooting motor activity, in line with empirical evidence. Finally, our simulations provide an explanation for the intensely debated paradox that focused basal ganglia lesions alleviate Parkinsonian symptoms, but do not impair performance in healthy animals. Overall, our simulations suggest that the effects of dopamine loss on synaptic plasticity play an essential role in the development of Parkinsonian symptoms, thus arguing for a re-conceptualisation of Parkinsonian pathophysiology.

Modern parallel hardware such as multi-core processors (CPUs) and graphics processing units (GPUs) have a high computational power which can be greatly beneficial to the simulation of large-scale neural networks. Over the past years, a number of efforts have focused on developing parallel algorithms and simulators best suited for the simulation of spiking neural models. In this article, we aim at investigating the advantages and drawbacks of the CPU and GPU parallelization of mean-firing rate neurons, widely used in systems-level computational neuroscience. By comparing OpenMP, CUDA and OpenCL implementations against a serial CPU implementation, we show that GPUs are better suited than CPUs for the simulation of very large networks, but that smaller networks would benefit more from an OpenMP implementation. As this performance strongly depends on data organization, we analyze the impact of various factors such as data structure, memory alignment and floating-point precision. We then discuss the suitability of the different hardware depending on the networks’ size and connectivity, as random or sparse connectivities in mean-firing rate networks tend to break parallel performance on GPUs due to the violation of coalescence.

Cortico-basalganglio-thalamic loops are involved in both cognitive processes and motor control. We present a biologically meaningful computational model of how these loops contribute to the organization of working memory and the development of response behavior. Via reinforcement learning in basal ganglia, the model develops flexible control of working memory within prefrontal loops and achieves selection of appropriate responses based on working memory content and visual stimulation within a motor loop. We show that both working memory control and response selection can evolve within parallel and interacting cortico-basalganglio-thalamic loops by Hebbian and three-factor learning rules. Furthermore, the model gives a coherent explanation for how complex strategies of working memory control and response selection can derive from basic cognitive operations that can be learned via trial and error.

While classical theories systematically opposed emotion and cognition, suggesting that emotions perturbed the normal functioning of rational thought, recent progress in neuroscience highlights on the contrary that emotional processes are at the core of cognitive processes, directing attention to emotionally-relevant stimuli, favoring the memorization of external events, valuating the association between an action and its consequences, biasing decision making by enabling the comparison of the motivational value of different goals and, more generally, guiding behavior towards fulfilling the needs of the organism. This article first proposes an overview of the brain areas involved in the emotional modulation of behavior and suggests a functional architecture enabling efficient decision making. It then reviews a series of biologically-inspired computational models of emotion dealing with behavioral tasks like classical conditioning and decision making, which highlight the computational mechanisms involved in emotional behavior. It underlines the importance of embodied cognition in artificial intelligence, as emotional processing is at the core of the cognitive computations deciding which behavior is more appropriate for the agent.

Visual working memory (WM) tasks involve a network of cortical areas such as inferotemporal, medial temporal and prefrontal cortices. We suggest here to investigate the role of the basal ganglia (BG) in the learning of delayed rewarded tasks through the selective gating of thalamocortical loops. We designed a computational model of the visual loop linking the perirhinal cortex, the BG and the thalamus, biased by sustained representations in prefrontal cortex. This model concurrently learns different delayed rewarded tasks that require maintaining a visual cue and associating it to itself or to another visual object to obtain reward. The retrieval of visual information is achieved through thalamic stimulation of the perirhinal cortex. The input structure of the BG, the striatum, learns to represent visual information based on its association to reward, while the output structure, the substantia nigra pars reticulata, learns to link striatal representations to the disinhibition of the correct thalamocortical loop. In parallel, a dopaminergic cell learns to associate striatal representations to reward and modulates learning of connections within the BG. The model provides testable predictions about the behavior of several areas during such tasks, while providing a new functional organization of learning within the BG, putting emphasis on the learning of the striatonigral connections as well as the lateral connections within the substantia nigra pars reticulata. It suggests that the learning of visual WM tasks is achieved rapidly in the BG and used as a teacher for feedback connections from prefrontal cortex to posterior cortices.

This review focuses on biological issues of reinforcement learning. Since the influential discovery by W. Schultz of an analogy between the reward prediction error signal of the temporal difference algorithm and the firing pattern of some dopaminergic neurons in the midbrain during classical conditioning, biological models have emerged that use computational reinforcement learning concepts to explain adaptive behavior. In particular, the basal ganglia have been proposed to implement, among other things, reinforcement learning for action selection, motor control or working memory. We discuss to what extent the analogy between the temporal difference algorithm and the firing of dopamine cells can be considered valid. Our review then focuses on the basal ganglia, their anatomy and key computational properties as demonstrated by three recent, influential models.
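The temporal difference error at the heart of this analogy is a one-line computation; a minimal sketch of the error and the resulting value update (standard TD learning, not any specific model from the review):

```python
def td_error(reward, v_next, v_current, gamma=0.9):
    """Temporal-difference reward-prediction error:
    delta = r + gamma * V(s') - V(s). Positive delta corresponds to the
    phasic dopamine burst for an unexpected reward, zero delta to a fully
    predicted one."""
    return reward + gamma * v_next - v_current

def td_update(v, delta, lr=0.1):
    """Move the value estimate a small step along the prediction error."""
    return v + lr * delta
```

As learning progresses, V(s) absorbs the prediction, the error at reward delivery vanishes, and the error migrates to the earliest reward-predicting cue, mirroring Schultz's recordings.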

The perirhinal cortex is involved not only in object recognition and novelty detection but also in multimodal integration, reward association, and visual working memory. We propose a computational model that focuses on the role of the perirhinal cortex in working memory, particularly with respect to sustained activities and memory retrieval. This model describes how different pieces of partial information are integrated into assemblies of neurons that represent the identity of an object. Through dopaminergic modulation, the resulting clusters can retrieve the global information with recurrent interactions between neurons. Dopamine leads to sustained activities after stimulus disappearance that form the basis of the involvement of the perirhinal cortex in visual working memory processes. The information carried by a cluster can also be retrieved by a partial thalamic or prefrontal stimulation. Thus, we suggest that areas involved in planning and memory coordination encode a pointer to access the detailed information encoded in associative cortices such as the perirhinal cortex.

Although dopamine is one of the most studied neurotransmitters in the brain, its exact function is still unclear. This short review focuses on its role in different levels of cognitive vision: visual processing, visual attention and working memory. Dopamine can influence cognitive vision either through direct modulation of visual cells or through gating of basal ganglia functioning. Even if its classically assigned role is to signal reward prediction error, we review evidence that dopamine is also involved in novelty detection and attention shifting and discuss the possible implications for computational modeling.

This thesis belongs to the field of computational neuroscience, whose goal is to model complex cognitive functions by means of numerical computer simulations while drawing inspiration from cerebral functioning. Contrary to a top-down approach, which requires knowing an analytic expression of the function to be simulated, the chosen bottom-up approach allows a function to emerge from the interaction of artificial neural populations, without any prior knowledge. We first present a particular type of neural network, neural fields, whose properties of robustness to noise and spatio-temporal continuity enable this emergence. In order to guide the emergence of sensorimotor transformations on this substrate, we then present the architecture of the visual and motor systems to highlight the central role of visual attention in the realization of these functions by the brain. We then propose a functional diagram of sensorimotor transformations in which the preparation of an ocular saccade guides attention towards a region of visual space and allows movement preparation. We finally describe a computational model of attentional spotlight displacement that, by using a dynamical spatial working memory, allows the sequential search of a target in a visual scene thanks to the phenomenon of inhibition of return. The performance of this model (robustness to noise, to object movement and to saccade execution) is analysed in simulation and on a robotic platform.

Some visual search tasks require memorizing the locations of stimuli that have been previously scanned. Eye movements raise the question of how we are able to maintain a coherent memory despite the frequent, drastic changes in perception they cause. In this article, we present a computational model that anticipates the consequences of eye movements on visual perception in order to update a spatial memory.

We present a dynamic model of attention based on the Continuum Neural Field Theory that explains attention as an emergent property of a neural population. Experiments show the model to be very robust, able to track a static or moving target in the presence of very strong noise or of many distractors, even ones more salient than the target. This attentional property is not restricted to the visual case and can be considered a generic attentional process over any spatio-temporally continuous input.
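The Continuum Neural Field Theory dynamics behind such models follow an Amari-type equation, where local excitation and broader inhibition let a single bump of activity form over the most consistent input. The discretized sketch below uses a difference-of-Gaussians kernel with illustrative parameters and a rectified-tanh transfer function, not the exact model or parameters of the paper.

```python
import math

def dnf_step(u, inputs, dt=0.1, tau=1.0,
             a_exc=1.5, s_exc=2.0, a_inh=0.75, s_inh=6.0):
    """One Euler step of a 1-D Amari-type neural field:
    tau du_i/dt = -u_i + sum_j w(i - j) f(u_j) + input_i,
    with a difference-of-Gaussians lateral kernel w (narrow excitation,
    broad inhibition) and a bounded rectified transfer function f."""
    n = len(u)
    f = [max(0.0, math.tanh(uj)) for uj in u]
    new_u = []
    for i in range(n):
        lateral = 0.0
        for j in range(n):
            d = i - j
            w = (a_exc * math.exp(-d * d / (2 * s_exc ** 2))
                 - a_inh * math.exp(-d * d / (2 * s_inh ** 2)))
            lateral += w * f[j]
        new_u.append(u[i] + dt / tau * (-u[i] + lateral + inputs[i]))
    return new_u
```

Iterated over a noisy input, the lateral interactions sharpen activity into a single bump at the strongest stimulus, which is what makes the field track a target through noise and distractors.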

Although biomimetic autonomous robotics draws on the massively parallel architecture of the brain, a key issue is to temporally organize behaviour. The distributed representation of sensory information has to be coherently processed to generate relevant actions. In the visual domain, we propose here a model of visual exploration of a scene by means of localized computations in neural populations, whose architecture allows the emergence of a coherent behaviour of sequentially scanning salient stimuli. It has been implemented on a real robotic platform exploring a moving and noisy scene that includes several identical targets.