Implementation and Evaluation of an Attention Model for Adaptive Vision

Matthieu Perreira Da Silva, Vincent Courboulay

IRCCyN, Université de Nantes, Rue Christian Pauc - BP 50609, F-44306 Nantes cedex 03

L3I, Université de La Rochelle, Avenue M. Crépeau, F-17042 La Rochelle cedex 01

In the field of scene analysis for computer vision, a trade-off must be found between the quality of the expected results and the amount of computing resources allocated to each task. An adaptive vision system provides a more flexible solution, since its analysis strategy can change according to the available information about the execution context. We describe how to create and evaluate a visual attention system tailored for interacting with a computer vision system, so that the latter adapts its processing to the interest (or salience) of each element of the scene. We propose a new set of constraints, named PAIRED, to evaluate the adequacy of a model with respect to its different applications. We justify the use of dynamical systems by their good properties for simulating the dynamic competition between different kinds of information. Finally, we present results showing that our model is fast, highly configurable, and plausible.

Extended Abstract

While machine vision systems are becoming increasingly powerful, in most regards they are still far inferior to their biological counterparts. In humans, evolution has produced a visual attention system that selects the most important information in order to reduce both cognitive load and ambiguity in scene understanding. Studying biological systems and applying the findings to the construction of computational vision models and artificial vision systems is therefore a promising way of advancing the field of machine vision.

In the field of scene analysis for computer vision, a trade-off must be found between the quality of the expected results and the amount of computing resources allocated to each task. This is usually a design-time decision, implemented through the choice of predefined algorithms and parameters. However, this approach limits the generality of the system. An adaptive vision system provides a more flexible solution, since its analysis strategy can change according to the available information about the execution context. As a consequence, such a system requires some kind of guiding mechanism to explore the scene faster and more efficiently.

In this article, we propose a first step toward building a bridge between computer vision algorithms and visual attention. In particular, we describe how to create and evaluate a visual attention system tailored for interacting with a computer vision system, so that the latter adapts its processing to the interest (or salience) of each element of the scene. Positioned between hierarchical salience-based models and competitive distributed models, our model is hierarchical yet competitive. This original approach allows us to generate the evolution of attentional focus points without needing either a saliency map or an explicit inhibition-of-return mechanism. This new real-time computational model is based on a dynamical system. The use of such a complex system is justified by an adjustable trade-off between nondeterministic attentional behavior and the properties of stability, reproducibility, and reactiveness.

In the first two sections, we give a brief overview of the main theories and concepts of human visual attention, and we present the strengths and weaknesses of state-of-the-art attention models. This analysis is based on their potential for integration into an adaptable computer vision system. We propose a new set of constraints called ‘PAIRED’ to evaluate the adequacy of a model with respect to its different applications.

In the third section, we provide an in-depth description of our model and its implementation. We justify why dynamical systems are a good choice for simulating visual attention, and we show that prey/predator models provide good properties for simulating the dynamic competition between different kinds of information. This dynamical system is also used to generate a focus point at each time step of the simulation. To show that our model can be integrated into an adaptable computer vision system, we show that this architecture is fast and allows flexible real-time visual attention simulation. In particular, we present a feedback mechanism used to change the scene exploration behavior of the model. This mechanism can be used either to maximize scene coverage (exploring every part of the scene) or to maximize focalization on a particular salient area (tracking).
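To make the prey/predator idea concrete, the following is a minimal sketch and not the model described in this article, whose equations are not given in this abstract. It assumes a hypothetical one-dimensional scene divided into cells, each with a saliency value that feeds a "prey" population (interest), consumed by a "predator" population (attention). The focus point at each time step is taken as the cell holding the most predators. The function name `simulate_focus`, all rate constants, and the Euler integration scheme are illustrative assumptions; spatial diffusion, feature hierarchies, and top-down feedback are deliberately omitted.

```python
import random


def simulate_focus(saliency, steps=2000, dt=0.05, noise=0.0, seed=0):
    """Toy per-cell prey/predator dynamics (Lotka-Volterra style).

    saliency: non-negative floats, one per scene cell (hypothetical input).
    Returns the focus index (cell with the most predators) at every step.
    All rate constants below are arbitrary illustrative values.
    """
    growth, predation, conversion, mortality, decay = 1.0, 1.0, 1.0, 0.5, 0.1
    rng = random.Random(seed)
    n = len(saliency)
    prey = [0.1] * n   # "interest" accumulated in each cell
    pred = [0.1] * n   # "attention" feeding on that interest
    path = []
    for _ in range(steps):
        for i in range(n):
            # Saliency, optionally perturbed by uniform noise, drives prey growth.
            drive = max(0.0, saliency[i] + noise * rng.uniform(-1.0, 1.0))
            d_prey = growth * drive - predation * prey[i] * pred[i] - decay * prey[i]
            d_pred = conversion * prey[i] * pred[i] - mortality * pred[i]
            # Explicit Euler step, clamped so populations stay non-negative.
            prey[i] = max(0.0, prey[i] + dt * d_prey)
            pred[i] = max(0.0, pred[i] + dt * d_pred)
        # Focus point: the cell currently holding the most predators.
        path.append(max(range(n), key=lambda k: pred[k]))
    return path


if __name__ == "__main__":
    focus_path = simulate_focus([0.1, 1.0, 0.2])
    print("final focus cell:", focus_path[-1])
```

In this toy version the cells do not interact, so without noise the focus eventually settles on the most salient cell. Setting `noise` above zero gives a rough feel for the nondeterministic behavior mentioned earlier, while the fixed `seed` keeps individual runs repeatable.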

In the last section, we present the evaluation results of our model. Since the model is highly configurable, its evaluation covers not its plausibility compared to human eye fixations (already studied in Perreira Da Silva et al., 2011) but the influence of each parameter on a set of properties:

– stability: do the values of the dynamical system stay within their nominal range when the different parameters of the model are changed?

– reproducibility: since discrete dynamical systems can exhibit chaotic behavior, what is the influence of the various parameters of the model (in particular, noise) on the variability of the focus paths generated during different simulations on the same data?

– scene exploration: which parameters influence the scene exploration strategy of our model?

– system dynamics: how can we influence the reactivity of the system? In particular, how can we control the mean fixation time?

For all of these properties we have also studied the influence of top-down feedback.
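As an illustration of how properties such as scene exploration, fixation time, and reproducibility could be quantified from a focus path, here is a small sketch. It assumes a focus path is simply a list of cell indices, one per time step, and the three measures (`coverage`, `mean_fixation_length`, `path_agreement`) are hypothetical names chosen for this example; they are not the metrics used in the evaluation section of this article.

```python
from itertools import groupby


def coverage(path, n_cells):
    """Fraction of scene cells visited at least once (scene exploration)."""
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return len(set(path)) / n_cells


def mean_fixation_length(path):
    """Average number of consecutive steps spent on the same cell."""
    if not path:
        raise ValueError("path must not be empty")
    runs = [len(list(group)) for _, group in groupby(path)]
    return sum(runs) / len(runs)


def path_agreement(path_a, path_b):
    """Fraction of time steps where two runs focus on the same cell.

    A crude reproducibility indicator: 1.0 means identical focus paths.
    """
    if not path_a or len(path_a) != len(path_b):
        raise ValueError("paths must be non-empty and of equal length")
    same = sum(1 for a, b in zip(path_a, path_b) if a == b)
    return same / len(path_a)


if __name__ == "__main__":
    run_1 = [0, 0, 1, 1, 1, 2]
    run_2 = [0, 0, 1, 1, 2, 2]
    print(coverage(run_1, n_cells=4))
    print(mean_fixation_length(run_1))
    print(path_agreement(run_1, run_2))
```

Comparing such measures across runs with different parameter settings (for example, different noise levels) is one simple way to probe the questions listed above.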


In the field of scene analysis for computer vision, a trade-off must be found between the quality of the expected results and the resources allocated to processing. A flexible solution is to use an adaptive vision system able to modulate its analysis strategy according to the available information and the context. In this article, we describe how to design and evaluate a visual attention system built to interact with a vision system, so that the latter adapts its processing to the interest (the salience) of each element of the scene. We also propose a new set of constraints named PAIRED, which makes it possible to evaluate the adequacy of the model for different applications. We justify the choice of dynamical systems by their useful properties for simulating the competition between different sources of information. Finally, we present a validation through different metrics showing that our approach is fast, highly configurable, and relevant.


attention model, dynamical model, adaptive vision, implementation, evaluation.


dynamical attention model, adaptive vision, implementation, evaluation.

1. Introduction
2. Computational Models of Visual Attention
3. A Hierarchical Competitive Visual Attention Model
4. Model Evaluation
5. Conclusion

