Abstracts

A Value-Driven Mechanism of Attentional Selection

Brian A. Anderson & Steven Yantis, Johns Hopkins University

Whether a stimulus is attended has long been thought to be a function of its physical salience (leading to stimulus-driven attentional capture) and its relation to ongoing visual search goals (leading to goal-directed deployment of attention). We will present evidence that learning to associate different stimuli with different amounts of reward persistently biases their attentional priority. These value-driven changes in attentional priority cause reward-associated stimuli to compete effectively for attention even when they are not salient and are task-irrelevant. The effects of reward learning on attentional priority can generalize to novel stimuli that express a reward-associated feature, and these effects can persist for long periods of time without further learning. We argue for a mechanism of attentional control that is distinctly value-based and independent of the well-established salience-driven and goal-driven mechanisms. Although value-driven attentional capture is often adaptive, facilitating the procurement of future rewards, it can become maladaptive when prior reward learning and ongoing goals conflict. Thus, our findings have additional implications for theories of attentional control in addiction.

 

A common discrete resource for visual working memory and visual search

Edward Awh, University of Oregon, Eugene

For decades, visual search has been a dominant paradigm for studying how observers deploy attention to find targets within scenes that contain varying numbers of non-target distractors.  Major models of visual search presume that working memory (WM) provides the “online” workspace for discriminating targets and distractors (e.g., Bundesen, 1990).  This hypothesis makes a clear prediction: individuals with high WM capacity should search more efficiently, because they should be able to simultaneously evaluate a larger number of search elements.  Nevertheless, despite vigorous efforts (Kane et al., 2006), no compelling evidence of such a correlation has emerged.  Here we provide multiple demonstrations of a robust correlation between WM capacity and search efficiency, and we document some key boundary conditions for observing this link.  In addition, we use a neural measure of visual selection capacity (the N2pc) to show that visual search and WM storage may be constrained by a common discrete resource.

 

Target-nontarget relations determine ‘bottom-up’ feature priming and ‘top-down’ contingent capture: A call for a unified account of top-down/bottom-up guidance?

Stefanie Becker, The University of Queensland & Ulrich Ansorge, The University of Vienna

Current models of visual search assume that attention is guided by the interplay of a top-down, feature-specific attentional system and a bottom-up, saliency-based attentional system. As such, current models have difficulty accounting for feature priming effects, which seem to be outside top-down control and yet are feature-specific: in visual search for a pop-out target, attention is biased towards the target feature from the previous trial (e.g., red) even when observers know that the target feature will be different (e.g., green).

Furthermore, in search for a pop-out target, attention is not tuned to the specific feature value of the target, or determined by bottom-up saliency: Rather, attention is preferentially tuned to target-nontarget relations that specify how the target differs from the irrelevant nontargets (e.g., redder, darker). Guidance by target-nontarget relationships governs both ‘bottom-up’ feature priming effects and ‘top-down’ contingent capture. This suggests that both top-down and bottom-up guidance may be based on the same mechanism, questioning the strict distinction between top-down and bottom-up processing.

 

Neural mechanisms of the serial search process

Tim Buschman, Massachusetts Institute of Technology, Cambridge

Visual search is a complex behavior that builds upon several underlying computations, such as controlling attention, perceiving attended stimuli, and comparing stimuli to remembered targets.  How the brain organizes these components into a coherent behavior is not well understood.  I will present evidence on the neural mechanisms of one such component: the control of attention.  By leveraging large-scale, multiple-region electrophysiology in non-human primates trained to perform a visual search task, we demonstrate a key role for frontal cortex, in particular the frontal eye fields, in directing attention.  Furthermore, we find both behavioral and neural evidence for a serial process underlying difficult visual search.  Interestingly, this serial process was also seen at the population level as an increase in 25 Hz oscillations in the local field potential.  However, these oscillations were not limited to the frontal eye fields but were synchronized across frontal and parietal cortex.  This may act as a ‘clocking’ signal that organizes neural activity and computations across the several brain regions involved in visual search.

 

Endogenous and Exogenous Forces Compete to Guide Visual Search

Chris Donkin, Denis Cousineau, Richard M. Shiffrin, Indiana University, Bloomington

Important insights into visual search, visual attention, and their interaction can be gleaned from analysis and modeling of full response time distributions. Our model is based on a sequence of individual object comparisons. Target-present responses occur when a comparison finds a target; target-absent responses are more complex, based on the need to balance the conflicting goals of speed and accuracy. At the heart of the model are the processes that determine the order of comparisons, based on a competition between forces that are endogenous (e.g., plans) and exogenous (e.g., attention to onsets, learned attention to consistent targets). The exogenous forces result from a parallel process of comparison to all display objects. This model is applied to data from several novel search conditions (such as advance knowledge of display positions and sequential presentation of objects in displays), and to data from studies that use stimuli of widely differing search difficulty. The model provides excellent predictions of full response time distributions for individuals, as well as insights into search and attention processes and into individual differences.
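
As an illustration of this class of model (a toy sketch under assumptions of our own choosing, not the authors' implementation), the following Python simulation draws the comparison order from a competition between an endogenous scan plan and exogenous salience, then sums lognormal comparison times until the target is found; the lognormal choice, the mixing weight, and the residual stage are all assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate_trial(n_items=8, w_endo=0.6, mu=np.log(0.05),
                       sigma=0.4, t_residual=0.3):
        """One toy target-present trial of a serial-comparison search.

        The comparison order mixes an endogenous plan (a left-to-right
        scan) with exogenous pulls (random salience). RT is the sum of
        lognormal comparison times up to the target plus a residual stage.
        """
        salience = rng.random(n_items)                      # exogenous force
        plan = np.linspace(1.0, 0.1, n_items)               # endogenous plan
        priority = w_endo * plan + (1 - w_endo) * salience  # competition
        order = np.argsort(-priority)                       # comparison order
        target_pos = rng.integers(n_items)
        n_comparisons = int(np.where(order == target_pos)[0][0]) + 1
        return rng.lognormal(mu, sigma, n_comparisons).sum() + t_residual

    rts = np.array([simulate_trial() for _ in range(5000)])
    print(f"mean RT = {rts.mean():.3f} s; mean - median = "
          f"{rts.mean() - np.median(rts):.3f} s (positive skew)")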

 

Attentional and motivational signals in the parietal cortex during visual search

Michael E. Goldberg, Xiaolan Wang, Annegret Falkner & Mingsha Zhang, Columbia University, New York

The lateral intraparietal area (LIP) provides a priority map of the visual world, which can be used by the oculomotor system to choose the goal of the next saccade, and by the visual system to pin attention in space.  In visual search it integrates visual, oculomotor, and cognitive signals to create this map.  LIP activity is also modulated by reward – visual responses are greater when a monkey will get a greater reward for an impending saccade (Platt and Glimcher), and monkeys, given a choice, make a saccade to a stimulus more likely to be rewarded (Sugrue, Corrado, and Newsome).

Here we show that motivation, independent of attention, modulates the activity of parietal neurons.  We trained monkeys to report the orientation (upright or inverted) of a capital T among lower-case t distractors, responding with a hand movement.  The task began with a 500 ms period in which the monkey fixated a central point waiting for the array to appear.  We interleaved two versions of the task: in the free search task the monkeys could move their eyes and make a saccade to the target; in the fixation task the monkeys had to maintain fixation and solve the task using peripheral vision.  The monkeys performed the free search task 95% of the time, and the fixation task 70% of the time.

LIP activity during the fixation period predicted the monkey’s performance in the fixation task: it correlated with the monkey’s success on the task, and inversely with a recency-weighted measure of the monkey’s recent history of reward.  It was independent of the monkey’s locus of spatial attention: a 50%-valid, task-irrelevant cue presented during the fixation period had no effect on the baseline signal when the cue was not in the receptive field of the neuron, although it had the expected validity effect on both the saccadic reaction time in the free search task and the reaction time of the hand movement.  Baseline activity also correlated, in a multiplicative manner, with the intensity of the response to the onset of the visual array.  We suggest that this baseline activity correlates with the monkey’s state of motivation (or arousal) and may be the cortical manifestation of ascending modulatory systems.
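
A recency-weighted reward history of the kind described is commonly computed as an exponential moving average; here is a minimal sketch (the decay parameter and the binary reward coding are assumptions for illustration):

    def update_reward_history(history, reward, alpha=0.3):
        """Exponentially recency-weighted average of past rewards.

        history: the current weighted estimate; reward: this trial's
        outcome (1 if rewarded, 0 otherwise); alpha: weight on the
        newest trial, so recent rewards dominate the estimate.
        """
        return (1 - alpha) * history + alpha * reward

    h = 0.0
    for r in [1, 1, 0, 1, 0, 0]:   # an example reward sequence
        h = update_reward_history(h, r)
    print(f"recency-weighted reward history = {h:.3f}")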

 

Brain mechanisms of attentional segmentation

Glyn Humphreys, University of Oxford

In many circumstances human search is efficient and little affected by irrelevant distractors. Typically such cases of efficient search have been thought of as being bottom-up in nature, determined by the perceptual saliency of a target in relation to its local surround. In contrast, we have been examining search conditions in which efficient segmentation of targets from distractors is determined – at least in part – by top-down guidance through the selective suppression of potentially competing distractors. I will present behavioural, brain imaging and neuropsychological data pointing to the role of posterior parietal cortex in top-down segmentation along different stimulus attributes, distinguishing brain regions involved in top-down guidance through segmentation from those involved in guiding serial search.

 

Inattention in humans

Masud Husain, UCL Institute of Cognitive Neuroscience & Institute of Neurology, London

Inattention is pervasive. It affects you and me several times a day. More notably, it impacts severely on the lives of people with a wide range of neurological disorders.

In this talk I’ll discuss the contributions of different types of attention and working memory mechanisms to inattention and impaired visual search in patients with focal lesions and neurodegenerative conditions.

I’ll attempt to show how these behavioural studies, in concert with imaging data, allow us to build hypotheses regarding the contribution of brain regions to attention and working memory. Finally, I hope to show how it might be possible to modulate inattention and impulsivity with neuromodulatory drugs.

 

A Neural Theory of Visual Attention

Søren Kyllingsbæk, University of Copenhagen

The neural theory of visual attention (NTVA) developed by Bundesen, Habekost, and Kyllingsbæk (2005) is a neural interpretation of Bundesen’s (1990) theory of visual attention (TVA). The theory accounts both for a wide range of attentional effects in human performance (reaction times and error rates) and for a wide range of effects observed in firing rates of single cells in the primate visual system. NTVA provides a mathematical framework to unify the two fields of research—formulas bridging cognition and neurophysiology. I will present new theoretical ideas on the relations between allocation of spatial attention in time, processing resources, and visual short-term memory.
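
For reference, the rate equation at the core of TVA (Bundesen, 1990), which NTVA reinterprets neurally: the rate v(x, i) at which the perceptual categorization “object x has feature i” is processed is

    v(x,i) = \eta(x,i)\,\beta_i\,\frac{w_x}{\sum_{z \in S} w_z},
    \qquad
    w_x = \sum_{j \in R} \eta(x,j)\,\pi_j,

where \eta(x,i) is the strength of the sensory evidence that x belongs to category i, \beta_i is the decision bias for category i, \pi_j is the pertinence of category j, and S and R are the sets of display objects and perceptual categories, respectively.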

 

A dual-stage account of inter-trial priming in spatial and in temporal search

Dominique Lamy, Tel Aviv University

The study of inter-trial priming (ITP) effects in visual search has generated an increasing amount of research. Recently, we proposed a dual-stage account of ITP, according to which repetition of target-defining features speeds both an early, perceptual stage and a later, response-related stage of visual search (Lamy, Yashar & Ruderman, 2010). Here, we refine this account by showing that (1) ITP does not speed attentional prioritization, (2) ITP speeds attentional engagement with the target, and (3) the response-related component of ITP pertains to motor processes (repetition of the motor response) rather than to perceptual processes (repetition of the response feature). Finally, we show that ITP effects extend to repetition of temporal position, which leads us to suggest an implicit short-term memory account as an alternative to existing models of the sequential effect of foreperiod (or warning signal).

 

States of control:  variability in salience-driven processing

Andrew Leber, University of New Hampshire, Durham

Recent efforts to examine the ebb and flow of attentional processing have yielded important insights about the cognitive and neural mechanisms underlying our capacity to resist distracting information.  Typically, these studies have examined variability in goal-driven sources of attentional control (e.g., stemming from frontoparietal brain regions).  Here, evidence will be presented showing that moment-to-moment variability in stimulus-driven processing also exerts a strong influence on one’s capacity to resist distraction.  Observers participated in an fMRI experiment in which they searched for a square among circles; half of the trials contained an irrelevant moving distractor, which reliably slowed RTs.  A whole-brain, data-driven analysis was then conducted to determine whether pre-trial baseline activity in any brain region could predict the magnitude of the distraction effect.  Results revealed a region in the middle temporal area (MT/MST) whose activity reliably predicted moment-to-moment fluctuations in distraction; specifically, as pre-trial activity increased, distraction effects increased.  This result demonstrates that greater baseline activity in a sensory region reduces observers’ ability to attentionally reject stimuli that are subsequently processed by that region.  Additional converging evidence, from analyses of individual differences in both behavioral and neural measures, will be presented to further support this finding.

 

Neuroimaging of Visual Attention in Aging

David J. Madden, Duke University Medical Center

Neuroimaging studies of younger adults have established that a network of frontoparietal cortical regions is critical for supporting visual search performance, particularly when target detection is inefficient. Visual search performance is not a static cognitive ability but instead exhibits considerable variability related to adult development, which in turn may reflect age-related differences in brain structure and function.

In a recent fMRI study, we investigated adult age differences in the use of top-down attention to resist distraction from a salient nontarget display item (a color singleton). The search task was a highly efficient form of target detection, in which the display size was always five items and a shape singleton target was either present or absent among nontarget shapes (e.g., a vertical bar target among four circle distractors). In this efficient search task, both younger and older adults were successful in using top-down attention (target predictability) to improve search performance. However, the older adults exhibited a disproportionate vulnerability to distraction from the color singleton, and target predictability did not provide any protection from distraction for either age group.

From the fMRI data we have identified two distinct but overlapping cortical networks related, respectively, to detection of the target and distraction from the color singleton. Variation in activation within these networks, particularly in superior parietal cortex, suggests that a salient distractor engages response competition mechanisms that are vulnerable to age-related decline.

Thus, although top-down attentional guidance remains available to support target detection during later adulthood, a highly salient distractor may engage response competition mechanisms that are less amenable to attentional control.

 

Experience Guided Search: A Bayesian Perspective On Attentional Control

Michael Mozer, University of Colorado

Individuals are remarkably flexible in their ability to search for arbitrary target objects in cluttered visual environments.  According to the Guided Search model (Wolfe, 1994, 2007), attention can be deployed to likely target locations by control processes that modulate the contribution of low-level visual-feature activations to a saliency map.  Although Guided Search does not make strong claims about the control processes per se, optimization is often suggested as a principle of attentional control (e.g., Baldwin & Mozer, 2006; Cave, 1999; Navalpakkam & Itti, 2006; Rao et al., 2002; Wolfe, Butcher, Lee, & Hyle, 2003).  However, optimization-based approaches are typically poor accounts of human performance because they are too slow (requiring more trials to tune behavior than individuals do) and too smart (achieving better search efficiency than individuals do).  We propose a principled Bayesian formulation of Guided Search, called Experience-Guided Search (EGS), that overcomes these challenges by casting attentional control as an inference problem.  Over a sequence of trials, EGS collects statistics of its task environment, which allow it to infer, via a probabilistic generative model of the environment, the discriminative value of features.  Because the trial sequence alone determines the degree to which features guide attention, EGS makes strong predictions concerning the relative efficiency of search.  EGS provides an elegant and parsimonious account of a wide range of visual search phenomena, including target-distractor similarity effects, sequential dependencies, and distractor proportion effects.  Moreover, testing a key prediction of EGS, we find evidence that priming affects search slopes in a target detection task.  Moving beyond the synthetic displays used in laboratory tasks, we have trained EGS to search for targets in naturalistic images, such as cars in street scenes.  Although attention is guided only by local features (vs. scene gist features), EGS does a reasonable job of locating targets and of replicating human fixation distributions.
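
A toy sketch of the general idea (not the published EGS model, which uses a fuller probabilistic generative model): track Beta pseudo-counts of how often each feature value belongs to the target versus a distractor across trials, and set guidance weights from the posterior discriminative value.

    class ExperienceGuidedWeights:
        """Toy Bayesian tracker of feature diagnosticity across trials.

        For each feature value we keep Beta(a, b) pseudo-counts of how
        often an item bearing that value was the target vs. a distractor.
        The guidance weight is the posterior mean probability that the
        value signals the target. Illustrative only.
        """

        def __init__(self, feature_values):
            self.a = {f: 1.0 for f in feature_values}  # target counts
            self.b = {f: 1.0 for f in feature_values}  # distractor counts

        def update(self, target_features, distractor_features):
            for f in target_features:
                self.a[f] += 1.0
            for f in distractor_features:
                self.b[f] += 1.0

        def weight(self, f):
            return self.a[f] / (self.a[f] + self.b[f])  # posterior mean

    egs = ExperienceGuidedWeights(["red", "green", "vertical", "horizontal"])
    for _ in range(20):  # a run of red-vertical targets among green-horizontal
        egs.update(["red", "vertical"], ["green", "horizontal"])
    print({f: round(egs.weight(f), 2) for f in ["red", "green"]})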

This work is in collaboration with Bradley Mitchell, Matt Wilder, Brett Roads, Karthik Venkatesh, and David Baldwin.

 

Pre-attentive and post-selective processing in ‘pop-out’ search tasks

H. J. Müller and the Munich group, Ludwig-Maximilians-Universität München

One central issue in visual pop-out search concerns which processes operate pre-attentively, i.e., prior to focal-attentional target selection, and which operate post-selectively, i.e., leading up to the required response. In this presentation, we will review behavioral and electrophysiological evidence that we have collected over the years, which favors a salience-based selection model in which salience computations are subject to (primarily) dimension-dependent inter-trial (i.e., bottom-up) and top-down modulations (the ‘dimension-weighting’ account). This notion also goes a long way toward explaining the capture of attention by irrelevant singleton distractors: our evidence suggests that capture is subject to (largely dimension-based) control processes that dynamically adjust interference from the distractor dimension on a fine temporal scale. Beyond this, our evidence shows that similar modulations also operate at post-selective stages of processing, and carry over across different types of tasks (e.g., detection, localization, compound tasks) given that these share similar component processes (the ‘multiple-weighting-systems’ hypothesis). While post-selective processing, i.e., the time needed to attentionally analyze the target and select an appropriate response, is heavily task-dependent, pre-selective processes are not influenced by task demands. One further issue to be addressed concerns the search conditions under which inter-trial effects are feature-specific, as opposed to dimension-specific, in nature. Our evidence suggests that when displays are sparse, search works via a feature-based mode; but when the same displays are made dense (enabling feature-contrast interactions), search tends to operate in singleton mode, which is associated with dimension-based effects. In summary, our work argues that target selection in pop-out search tasks is saliency-based, where the selection-relevant saliency signals integrate different sources of information (bottom-up: stimulus salience; top-down: e.g., working memory influences). Post-selective processing is also subject to inter-trial and cueing effects, but these are additive to the early, selection-based effects.
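
A schematic sketch of the dimension-weighting idea (the parameter values and the update rule are illustrative assumptions): dimension-specific feature-contrast maps are summed into a master salience map with weights, and the weight of the dimension that defined the previous target receives an inter-trial boost.

    import numpy as np

    def master_salience(contrast_maps, weights):
        """Weighted sum of dimension-specific feature-contrast maps."""
        return sum(weights[d] * m for d, m in contrast_maps.items())

    def update_weights(weights, last_target_dim, bump=0.2):
        """Boost the previous target's dimension, then renormalize."""
        w = dict(weights)
        w[last_target_dim] += bump
        total = sum(w.values())
        return {d: v / total for d, v in w.items()}

    # Feature-contrast signals at six display locations, two dimensions.
    contrast = {"color":       np.array([0., 0., 5., 0., 0., 0.]),
                "orientation": np.array([0., 3., 0., 0., 4., 0.])}
    weights = {"color": 0.5, "orientation": 0.5}
    weights = update_weights(weights, "color")   # color target on trial n-1
    print("first selected location:",
          int(np.argmax(master_salience(contrast, weights))))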

 

Attentional templates in visual search

Chris Olivers & Martin Eimer, VU University Amsterdam & Birkbeck College, University of London

Popular theories assume that potential target objects in visual search are prioritized on the basis of a match to a visual working memory representation, referred to as the template. In this twofold lecture, Chris will first review evidence converging on the idea that holding an object in working memory can be, but is not always, sufficient for inducing an attentional bias towards that object. Nor is visual working memory necessary for such feature-based biases to occur, as more implicit episodic memories, as well as long-term feature associations, induce very similar biases. Taken together, the evidence suggests that working memory tries to be as lazy as possible, specifying just the minimal task requirements and letting other memory systems, or even the stimulus itself, fill in the template on a “need-to-know” basis. Martin will then discuss recent experiments that studied the time course and efficiency of template-guided attentional object selection with ERP measures. The results demonstrate that attentional templates have several beneficial effects: they expedite target selection, prevent attentional capture by distractors, and regulate access to working memory. Other evidence shows that at any given moment, attentional templates specify only a single feature within a given dimension. Template-controlled attention shifts between visual objects can be initiated virtually instantaneously.

 

Visuospatial working memory supports the retrieval (or use) of contextual templates from long-term memory

Stefan Pollmann, Otto-von-Guericke University, Magdeburg

Contextual cueing is typically regarded as an implicit, automatic form of spatial learning that occurs in repeated search displays. Recently, however, behavioral studies have suggested that the search facilitation observed in repeated displays depends on visuospatial working memory resources; working memory appears to support the expression of learning rather than the learning itself (Manginelli et al., Exp. Psychol. 2011; Vickery et al., JEP:HPP 2010). Furthermore, the implicit nature of contextual cueing has been called into question: several studies have shown that contextual cueing depends at least partly on explicit learning.

I will report fMRI studies that investigate the neural foundations of working memory involvement in contextual cueing on the one hand and explicit versus implicit contextual memory on the other hand. Furthermore, I will discuss studies on foveal viewing impairment (either by macular degeneration or gaze-contingent scotoma simulation) that suggest an interference between top-down gaze control and contextual cueing.

 

Neural Mechanisms of Attention

John Reynolds, Salk Institute for Biological Studies, La Jolla

Attention is a solution that evolution has found to a fundamental problem of perception: the sensory environment contains far more information than our perceptual systems can process at any moment in time. Natural selection has therefore endowed us with attentional mechanisms that enable us to flexibly select behaviorally relevant information to guide behavior. Research in my laboratory seeks to understand (1) how neural signals vary with attentional state, (2) what neural mechanisms give rise to these changes, and (3) how attention-dependent changes in neuronal signaling alter perception.  We have found that, in addition to modulating mean firing rates, attention reduces neuronal response variability in macaque extrastriate visual cortex (Mitchell, Sundberg & Reynolds, 2007, 2009). In our hands, this accounts for the lion’s share (80%) of observed attention-dependent improvements in neuronal signal quality.  It is therefore essential to understand how attention reduces neuronal response variability.  This has led us on a journey into the neuron, to examine how the intracellular mechanisms that determine its spiking behavior are modulated by attentional state.  The result is a new model of attention, in which attention-dependent changes in response variability and mean firing rate both result from increases in the excitatory and inhibitory synaptic conductances that drive neuronal activity.  The model predicts that directing attention into a neuron’s receptive field will cause (1) a reduction in the neuron’s tendency to fire action potentials in bursts and (2) a reduction in the amplitude of the neuron’s action potentials.  We find that both predictions hold in macaque visual area V4. The model thus offers a unified explanation of four distinct types of attentional modulation, two of them novel.
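
The conductance account lends itself to a toy simulation; the sketch below (the parameters, noise model, and attention gain are all illustrative assumptions, not the authors' model) drives a conductance-based integrate-and-fire neuron with noisy balanced input, scales both excitatory and inhibitory conductances to mimic attention, and compares trial-to-trial spike-count variability (the Fano factor) across conditions.

    import numpy as np

    rng = np.random.default_rng(1)

    def run_trial(gain, T=0.5, dt=1e-4):
        """Spike count from a conductance-based LIF neuron in one trial.

        gain scales both excitatory and inhibitory conductances, a
        stand-in for an attention-dependent increase in synaptic drive.
        """
        v, v_rest, v_th, v_reset = -65e-3, -65e-3, -50e-3, -65e-3
        E_e, E_i, g_leak, C = 0.0, -80e-3, 10e-9, 200e-12
        spikes = 0
        for _ in range(int(T / dt)):
            g_e = gain * 8e-9 * max(0.0, 1 + 0.5 * rng.standard_normal())
            g_i = gain * 4e-9 * max(0.0, 1 + 0.5 * rng.standard_normal())
            dv = (-g_leak * (v - v_rest) - g_e * (v - E_e)
                  - g_i * (v - E_i)) / C
            v += dv * dt
            if v >= v_th:      # threshold crossing: count a spike, reset
                spikes += 1
                v = v_reset
        return spikes

    for gain, label in [(1.0, "attend away"), (1.5, "attend in")]:
        counts = np.array([run_trial(gain) for _ in range(100)])
        print(f"{label}: mean count = {counts.mean():.1f}, "
              f"Fano factor = {counts.var() / counts.mean():.2f}")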

 

Rethinking the role of top-down selective attention in visual search

Ruth Rosenholtz, Massachusetts Institute of Technology

Difficulty performing a number of tasks, such as visual search, suggests that there is a bottleneck in visual processing.  According to the traditional view, at any given moment selective attention allows only a small portion of the visual input to get through the bottleneck for further processing.  Some processing can occur “preattentively” and guide this selection. Much of the early research on visual search focused on determining what processing could occur preattentively, and what required selective attention.

While this view of visual search and attention has held sway for many years, it has also been problematic.  My lab proposes an alternative, in which the visual system’s strategy for dealing with limited capacity focuses on compression of the visual input, rather than on selective attention.  I will demonstrate that this model can predict not only classic results in visual search, but also results that were problematic for the traditional selective attention story.

 

Stage theory of visual search: Gated accumulator model

Jeffrey Schall, Vanderbilt University, Nashville

Many lines of evidence demonstrate that representing salience and preparing saccades are dissociable because they are accomplished by different (although connected) populations of neurons.  I will summarize the evidence for this claim and describe a stochastic accumulator model that explains how visual search performance can be understood as a gated, feed-forward cascade from a salience map to multiple competing accumulators. The model quantitatively accounts for behavior and predicts the neural dynamics of macaque monkeys performing visual search for a target stimulus among different numbers of distractors. The evidence accumulated in the model is equated with the spike trains recorded from visually responsive neurons in the frontal eye field thought to encode stimulus salience. Accumulated variability in the firing rates of these neurons explains choice probabilities and the distributions of correct and error response times across search arrays of different set sizes, provided the accumulators are mutually inhibitory. The dynamics of the stochastic accumulators quantitatively predict the activity of presaccadic movement neurons that initiate eye movements, provided gating inhibition prevents accumulation before sufficient evidence about stimulus salience has emerged. Adjustments in the level of gating inhibition could control the tradeoff between speed and accuracy that optimizes visual search performance.  However, new research from monkeys performing visual search under different speed-accuracy instructions demonstrates that the adjustments are more complex and surprising than expected by any current model.
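
A minimal sketch of a gated, mutually inhibitory accumulator race of the kind described (the parameters, noise model, and salience values are illustrative assumptions): input from the salience map reaches the accumulators only above a gate, the accumulators inhibit one another, and the first to reach threshold determines the saccade and its latency.

    import numpy as np

    rng = np.random.default_rng(2)

    def gated_race(salience, gate=0.4, beta=0.2, threshold=30.0,
                   noise=0.5, max_t=2000):
        """Gated feed-forward race; returns (winning item, finish time)."""
        salience = np.asarray(salience, dtype=float)
        a = np.zeros_like(salience)                   # accumulator states
        for t in range(1, max_t + 1):
            drive = np.maximum(salience - gate, 0.0)  # gating inhibition
            inhib = beta * (a.sum() - a)              # mutual inhibition
            a += drive - inhib + noise * rng.standard_normal(a.size)
            a = np.maximum(a, 0.0)                    # rates stay non-negative
            if a.max() >= threshold:
                return int(a.argmax()), t
        return -1, max_t                              # no decision reached

    # A target (salience 1.0) among five less salient distractors (0.5).
    wins = [gated_race([1.0] + [0.5] * 5)[0] for _ in range(1000)]
    print(f"P(target chosen) = {np.mean(np.array(wins) == 0):.2f}")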

 

Nonconscious working memory biases

David Soto, Imperial College London

I will present new behavioural data indicating that conscious awareness is not required for top-down guidance of visual attention by the contents of working memory. I will show that items matching the contents of working memory can bias attention even when they are not consciously seen. I will also show that subliminal visual signals can be maintained in working memory, even in the presence of conscious distracters, and can guide perceptual decision making in a delayed discrimination task. Finally, I will present evidence from functional MRI indicating a role for the anterior prefrontal cortex in this process of nonconscious working memory guidance. These results challenge the idea that working memory biases and prefrontal function must be linked to conscious awareness.

 

Keeping information in mind: role of delay activity for working memory and attention

Mark Stokes, University of Oxford

Working memory is crucial for keeping in mind behaviourally relevant information for goal-directed action and decision making. In particular, maintaining perceptual information frees behaviour from direct stimulus-dependency, and maintaining the associated rules allows us to respond flexibly to the contents of our environment: past, present or future. Ever since the 1970s, it has been widely assumed that working memory is maintained via persistent activity that keeps representations active even after the period of sensory stimulation. Moreover, sustained activation of task-relevant representations for working memory could also bias perceptual analysis of subsequent input, thereby providing a neurobiological basis for a biased competition model of attention. However, these influential ideas imply a static form of active maintenance, whereas recent neurophysiological evidence highlights dynamic neural activity over stationary brain states. Here, we consider recent evidence for dynamic population coding, and propose an alternative model in which temporary changes in synaptic efficacy provide the underlying stability for working memory. Neural firing is important for configuring the context-dependent connectivity states, but not necessarily for continuous maintenance of the content and rules that constitute working memory.  We also consider how these changes could mediate attentional biasing of ongoing sensory processing.
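
A toy sketch of the synaptic-storage idea (all parameters are illustrative assumptions, not the authors' model): a brief stimulus transiently boosts the synaptic efficacy of its input channel, the boost decays silently with no ongoing firing, and a later nonspecific probe reads the memory out because the potentiated channel responds more strongly.

    import numpy as np

    def simulate(tau_u=4.0, dt=0.1, t_total=8.0, encode=(0.5, 1.0), probe=6.0):
        """Activity-silent memory via a decaying synaptic-efficacy trace.

        Two input channels; channel 0 is stimulated during `encode`.
        Its efficacy u[0] is facilitated, then decays toward baseline
        with time constant tau_u while activity is silent. A weak,
        nonspecific probe at time `probe` evokes a larger postsynaptic
        response in the remembered channel.
        """
        n_steps = int(t_total / dt)
        u = np.full(2, 0.2)                     # baseline efficacies
        response = np.zeros((n_steps, 2))
        for k in range(n_steps):
            t = k * dt
            inp = np.zeros(2)
            if encode[0] <= t < encode[1]:
                inp[0] = 1.0                    # encode item in channel 0
            if probe <= t < probe + 0.2:
                inp[:] = 0.5                    # nonspecific readout probe
            u += dt * (-(u - 0.2) / tau_u + 0.6 * inp * (1.0 - u))
            response[k] = u * inp               # postsynaptic drive
        return response

    resp = simulate()
    print("probe response, remembered vs. other channel:",
          resp[-20:, 0].max().round(3), resp[-20:, 1].max().round(3))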

 

Automatic selection

Jan Theeuwes, Vrije Universiteit Amsterdam

In this presentation I will argue that the salience map that drives automatic selection is determined not only by the raw physical salience of the objects in the environment but also by the way these objects are shaped by selection history. We provide evidence that priming (feature and reward priming) sharpens the cortical representation of these objects, such that they appear to be more salient above and beyond their physical salience. We demonstrate that this type of priming is not under volitional control: it occurs even if observers try to volitionally prepare for something else. In other words, looking at red prepares our brain for things that are red, even if we volitionally try to prepare for green.

 

Modeling Guidance in Scenes

Jeremy M. Wolfe, Harvard Medical School, Boston

The first generation of models of visual search was intended to account for search for a target placed in a random array of distractor items. In models like “Guided Search”, features of the target and, perhaps, the distractors were used to guide attention to the target. Thus, when searching for a red, vertical line, attention would be guided to “red” and “vertical”. In the real world, random arrays of items do occur (e.g., in that drawer in your kitchen). However, most real-world searches are searches through structured scenes. Structured scenes permit guidance based on the properties of the scene rather than the properties of the target. Consider the search for a pot. “Syntactic” guidance, based on the structure of the scene, guides attention to horizontal surfaces that could hold a pot. “Semantic” guidance, based on the “meaning” of a scene, guides attention to appropriate scene regions (e.g., a stove) in preference to less appropriate, but syntactically legal, surfaces (e.g., the floor). “Size-Depth” guidance, based on the mapping from the 2D image to 3D space, guides attention to objects whose size in the 3D world is pot-like, regardless of their size in the 2D retinal image. Much work will be required to understand the interplay of classic feature guidance with these forms of scene-based guidance.
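
One simple way to picture the combination problem the abstract ends on is to treat each guidance source as a map over the scene and combine them into a single priority map; the linear weighting below is a naive illustrative assumption, not a claim about how Guided Search actually combines these signals.

    import numpy as np

    def priority_map(maps, weights):
        """Naive linear combination of guidance maps into a priority map."""
        return sum(weights[k] * maps[k] for k in maps)

    h, w = 4, 6                                  # a tiny toy "scene"
    maps = {
        "feature":   np.random.default_rng(3).random((h, w)),  # pot-like color
        "syntactic": np.zeros((h, w)),           # horizontal surfaces
        "semantic":  np.zeros((h, w)),           # stove-like regions
        "size":      np.full((h, w), 0.5),       # pot-sized in 3D
    }
    maps["syntactic"][2, :] = 1.0                # a countertop row
    maps["semantic"][2, 4] = 1.0                 # the stove
    weights = {"feature": 0.3, "syntactic": 0.3, "semantic": 0.3, "size": 0.1}
    best = np.unravel_index(np.argmax(priority_map(maps, weights)), (h, w))
    print("attend first to location", best)      # on or near the stove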

 

Measuring the interplay of working memory and long-term memory representations in the top-down control of attention during visual search

Geoffrey F. Woodman, Vanderbilt University

Although theories of attention propose that target representations are stored in working memory to bias attention mechanisms, we lack direct evidence from humans supporting this hypothesis.  I will discuss experiments using event-related potentials which show that when humans search complex visual scenes, a representation of the target is maintained in visual working memory provided the identity of the searched-for item changes across trials.  However, when the target identity is stable across trials, we can watch these working memory ‘attentional templates’ being handed off to long-term memory.  The talk concludes by discussing how we are using these methods to address questions about the locus of top-down attentional control when visual search is performed in different contexts.

 

Modeling Guidance and Recognition in Categorical Visual Search

Gregory Zelinsky, Yifan Peng, Alexander Berg, & Dimitris Samaras, Stony Brook University

Search consists of a guidance process that compares a target representation to blurred patterns in the visual periphery in order to generate a guidance signal, and a recognition process that verifies targets and rejects distractors, usually after fixation by gaze.  Do these component search tasks use the same visual features?  We addressed this question by training several SVM-based classifiers to describe both behaviors.  Observers did a present/absent categorical search for a teddy bear target in four-object arrays.  Target-absent trials consisted of random-category objects ranked as visually target-similar, target-dissimilar, or medium-similarity, as described in Alexander & Zelinsky (2011, JOV).  Accuracy was high on target-present (95.3%) and target-absent (97.9%) trials, and guidance was quantified in terms of the object first fixated during search.  First fixations were most common on targets (79.4%), followed by target-similar (65.5%), medium-similarity (12.4%), and target-dissimilar (5.7%) distractors.  Bear/non-bear classifiers were trained using features ranging in biological plausibility (V1, C2, SIFT-BOW, SIFT-SPM), with each feature tested separately and in combination with color.  Training and testing were done on unblurred and blurred versions of objects in separate conditions.  Objects were blurred using TAM (Zelinsky, 2008, Psych Review), which approximated an object’s appearance in peripheral vision before the initial eye movement.  Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by a linear SVM using C2 and color features; this simple and biologically plausible model outperformed state-of-the-art computer vision models in a head-to-head comparison.  These results were obtained by training on unblurred objects, then testing on blurred objects (the pre-fixation viewing conditions existing at the time of guidance) and on unblurred objects (the post-fixation viewing conditions existing at the time of recognition).  We also flipped these testing conditions (blurred objects for recognition and unblurred objects for guidance) to explore the effects of object blur on these tasks.  Our findings suggest that, despite operating under different viewing conditions, search guidance and recognition may be mediated by the same features, making these processes more similar than the search literature had previously believed.
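
A schematic of the train-unblurred / test-blurred classifier logic described above, using scikit-learn's LinearSVC; the feature extraction is abstracted away (random placeholder vectors stand in for C2-plus-color descriptors, and blur is simulated as added noise rather than TAM), so this is a pipeline sketch under stated assumptions, not a reimplementation.

    import numpy as np
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(4)

    # Placeholder features: one row per object; a real run would substitute
    # C2+color descriptors computed from the object images.
    n_train, n_test, n_dims = 400, 100, 64
    X_train = rng.normal(size=(n_train, n_dims))
    y_train = rng.integers(0, 2, n_train)       # 1 = teddy bear, 0 = non-bear
    X_train[y_train == 1] += 0.8                # bears get a feature offset

    X_test = rng.normal(size=(n_test, n_dims))
    y_test = rng.integers(0, 2, n_test)
    X_test[y_test == 1] += 0.8
    X_test_blurred = X_test + rng.normal(scale=0.5, size=X_test.shape)

    clf = LinearSVC(C=1.0).fit(X_train, y_train)   # train on unblurred objects
    print("recognition (unblurred test):", clf.score(X_test, y_test))
    print("guidance (blurred test):", clf.score(X_test_blurred, y_test))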