Jeremy Wolfe – Anne Treisman’s legacy and the future of visual search
For most researchers in the visual search trade, Anne Treisman’s work was foundational. Whether you agreed or disagreed with her, you could not ignore the body of data and theory that she created. In this talk, I will review some of my agreements and disagreements with Treisman’s Feature Integration Theory. My Guided Search theory, in its various incarnations, was the product of my fruitful interaction with Anne. For the most part, our arguments dealt with tasks where observers looked for one target amongst a set of items randomly distributed on an otherwise blank background. In the second part of the talk, I will consider whether the rules that govern those tasks are relevant when we search in real scenes, when we might be searching for more than one type of target, and when we don’t know how many instances of targets might be present in the search stimulus. The answer will be a qualified “yes”. In the third section, if I have not exhausted the allotted time and the patience of the audience, I will discuss some of the problems posed by socially important search tasks like cancer screening and consider whether basic behavioral research has solutions to offer.
Session 1: Search guidance and attentional capture
Steven Luck (Keynote) – Mechanisms for the suppression of irrelevant objects during visual search
We have long known that attention can be directed toward items containing task-relevant feature values. But can attention also be directed away from irrelevant features (i.e., features indicating that an item is a nontarget)? In this presentation, I will review recent studies indicating that items containing distinctive nontarget feature values can be suppressed so that they attract attention less than “neutral” items. This mechanism can be used to suppress salient singletons, as assessed with psychophysics, eye tracking, and ERPs (with significant correlations among these measures, suggesting that they all reflect the same underlying mechanism). This mechanism can also be used to suppress nonsalient distractor items. However, the suppression mechanism does not appear to be under direct voluntary control. First, if observers are cued to avoid a specific color, the first eye movement tends to be directed to the to-be-avoided color. Second, the suppression appears to build up over trials. Third, if automatic priming from the previous trial is put into competition with explicit cuing of the to-be-avoided color, priming wins and suppression loses. The emerging picture is that explicit goals can direct attention toward but not away from specific feature values, but goal-driven experience with target and distractor features can lead to automatic suppression of to-be-avoided features.
Charles Folk – Semantic templates and attentional capture
Over the last 25 years, research on attentional guidance and capture has focused on the relative influence of bottom-up salience, top-down set, and more recently, selection history. An implicit assumption in much of this work has been that attentional guidance is limited to preattentively processed feature information (e.g., color, orientation, brightness, etc.). For example, a color singleton might capture attention based on a low level, salient, feature contrast, and that capture might be modulated by a top-down set for a particular color value. However, a growing number of studies looking at visual search in naturalistic scenes suggest that semantic/categorical information can have a dramatic impact on overt attention allocation as measured by eye movements. In addition, there is strong evidence that emotional content (independent of featural content) can produce evidence of attentional capture. Here we address whether attentional capture by semantic/categorical content is limited to emotional stimuli, or whether establishing a top-down set or template for semantic information can result in the contingent capture of attention by stimuli matching the semantic set. A series of behavioral and electrophysiological studies using an RSVP methodology will be reviewed that explore the degree to which natural images depicting exemplars from superordinate categories can elicit the capture of covert attention, and whether such capture is contingent on a top-down set for the relevant category.
Joy Geng – The role of context in shaping dynamic attentional templates
Theories of attention commonly refer to the “attentional template” as the collection of features in working memory that represent the target of visual search. Many models of attention assume that the template contains a veridical representation of target features, particularly when the target is defined by just one feature (e.g., a single color). However, recent studies have shown that the target representation can be “shifted” away from the true target value in order to optimize its distinctiveness from distractors and facilitate visual search (Navalpakkam and Itti, 2007; Becker, 2010; Geng et al., 2017). Despite these demonstrations, it remains unclear what conditions produce specific changes in the target representation. Here, I describe experiments in which we have investigated how the target template changes as a function of distractor context, stimulus complexity, and individual differences. Our data indicate that the “tuning” of the template is shaped by a number of factors, but its distance from distractors always predicts visual search performance.
Chris Olivers – Proactive and reactive control over target selection in visual search
Searching for more than one type of target often, but not always, results in switch costs. Using a gaze-contingent eye-tracking paradigm in which we instruct participants to simultaneously look for two target objects presented among distractors, we find that the occurrence of switch costs depends on target availability. When both targets are available in a display, thus giving the observer free choice on what to look for, little to no switch costs occur. In contrast, clear switch costs emerge when only one of the two targets is available, so that the target object is being imposed. This pattern occurs within and across various stimulus dimensions, and can be explained by assuming limited active attentional guidance in combination with a role for different types of cognitive control in visual search. While full target availability allows for proactive control over target selection, single target availability requires reactive control in response to unanticipated targets. I will furthermore present combined eye-tracking + fMRI and eye-tracking + EEG studies tracing both source and dynamics of these different control processes in visual search.
Dominique Lamy – Attentional capture without attentional engagement: a camera metaphor of attention
Most models of spatial attention assume that attention operates like a spotlight and that stimuli appearing in the focus of attention are mandatorily processed. Here, we show that when an irrelevant object captures attention, the shift of attention can be shallow and not followed by attentional engagement. In three sets of experiments, we measured spatial shifts of attention to an irrelevant distractor (or cue) as enhanced performance when the target appeared at the same vs. at a different location relative to the cue and attentional engagement as enhanced performance when the response-relevant feature at the cued location was compatible vs. incompatible with the target’s response feature. We found that (1) attentional shifts to irrelevant onsets were followed by attentional engagement at the cued location only with relevant-color and not with irrelevant-color onsets (contingent attentional engagement); (2) attentional shifts to relevant-color cues were independent of conscious perception of the cue, whereas attentional engagement was contingent on it; (3) attentional shifts to relevant-color cues were unaffected by the attentional blink, whereas attentional engagement was reduced and the N2pc component of the ERP suppressed.
We discuss the implications of these findings for the distinction between stimulus-driven and goal-dependent attentional capture, the mechanisms indexed by the N2pc and more broadly, models of spatial attention. In particular, we suggest that attention operates like a camera, which requires both aligning the zoom lens and pushing the shutter button, rather than like a spotlight.
Alejandro Lleras – Search efficiency for targets defined by two feature dimensions can be predicted based on search efficiency measures for targets defined along a single dimension
A new model for efficient visual search (Contrast Signal Theory – CST) is proposed whereby the goal of early parallel processing is to compute a contrast signal between the target template in memory and each item in the display. This architecture allows the visual system to compute fast and confident decisions about items in the display that are sufficiently different from the target such that parallel, peripheral evaluation of these items is sufficient to discard them as non-targets. In this model, the logarithmic search observed when a target is sufficiently different from lures is proposed to be inversely proportional to that [lure-target] contrast signal, such that evidence accumulation will accrue faster at locations where contrast is larger (i.e., the lure-target similarity is low) than where contrast is smaller (lure-target similarity is high). The Contrast Signal Theory has shown some early successes: it allows one to predict RTs for heterogeneous displays based on performance observed in homogeneous displays. Here, we ask: can search efficiency for targets that differ from distractors along two dimensions (color and shape) be predicted by the search efficiency observed for targets that differ from distractors along a single dimension (only differ in color or only differ in shape)? Predictions from various models are compared. Results from ten experiments show that there is a simple equation to derive the combined ([color x shape]) search efficiency based on the search efficiency observed along individual dimensions ([color] & [shape]).
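The logarithmic relationship described in this abstract can be sketched in a few lines. This is a minimal illustration and not the authors' model code: the function names, the baseline of 450 ms, and the scaling constant `k` are all hypothetical, and the sketch covers only the single-dimension case. It does not attempt the two-dimension combination equation, which the abstract does not state.

```python
import math


def log_slope(contrast, k=40.0):
    """Logarithmic slope, assumed inversely proportional to target-lure contrast.

    `k` is a hypothetical scaling constant (ms per log unit at contrast 1).
    """
    if contrast <= 0:
        raise ValueError("contrast must be positive")
    return k / contrast


def predicted_rt(n_lures, contrast, baseline_ms=450.0, k=40.0):
    """Predicted RT (ms) for a homogeneous display of `n_lures` identical lures.

    RT grows with the natural log of set size; the growth rate shrinks as the
    target-lure contrast grows (i.e., as lure-target similarity drops).
    """
    return baseline_ms + log_slope(contrast, k) * math.log(n_lures + 1)


if __name__ == "__main__":
    for contrast in (1.0, 4.0):
        rts = [round(predicted_rt(n, contrast), 1) for n in (1, 4, 16, 32)]
        print(f"contrast={contrast}: {rts}")
```

Under these assumptions, a high-contrast lure yields a flatter logarithmic function than a low-contrast lure at every set size, which is the qualitative pattern the abstract describes.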
Session 2: Search guidance based on (acquired) ST/LT memory / selective attention in visual WM
Kia Nobre (Keynote) – Memory and attention: the back and forth
Intuition tells us that memory is about moving back, to retrieve the past, whereas attention is about moving forth, to anticipate the future. In my talk I will argue that these arrows of time are misleading, and suggest instead that memory and attention work together in tightly knit complementary ways to link past and future in the service of guiding adaptive behaviour.
Geoff Woodman – Context triggers the retrieval of targets stored in long-term memory
How do we know what we are looking for in familiar scenes and surroundings? One proposal from theories of human memory is that visual working memory buffers mnemonic contents retrieved from long-term memory. The retrieved contents can then form an online mental representation (i.e., an attentional template) to control and guide attention. In the present study, we tested the hypothesis that context triggers the retrieval from long-term memory of the possible targets given that context. For example, being in the driver’s seat of a car triggers the retrieval of road hazards. Here we recorded human subjects’ electroencephalogram (EEG) while they searched for objects on different colored backgrounds. Subjects searched for different sets of 1, 2, 4, and 6 unique real-world objects with each target set size paired with a different search context color. While learning, they also had to hold the search set of objects in mind during a blank delay to perform a visual search task at the end of each trial. After learning, subjects performed the visual search tasks in the different color contexts. During this final phase, we found that the colored backgrounds elicited EEG and event-related potentials of visual working memory maintenance that recapitulated the set size of the objects that people were to look for on that background. These results support the idea that contextual retrieval cues are sufficient for people to pull information out of long-term memory and into working memory to guide attention.
Roy Luria – An object based pointer system underlying visual working memory ability to access its online representations
The world around us constantly changes, posing a difficult challenge for our visual system, which must constantly modify the information it represents accordingly. This process is carried out by visual working memory (VWM), which can access a specific representation and modify it according to changes in the environment.
We argue that in order to access and modify the corresponding information, each representation within the VWM workspace must be stably mapped to the relevant stimuli. The idea of such a “pointer system” has been theoretically proposed in the past (e.g., FINST, Pylyshyn, 2000), but empirical support for it was largely limited to a tracking task, in which the only relevant information was spatial.
First, we provide evidence that VWM relies on such a pointer system in a shape change detection task, in which spatial information is task-irrelevant. By manipulating the pointer’s stability, we demonstrated that the loss of a pointer was accompanied by stable electrophysiological and behavioral markers, allowing us to use them as signatures of the pointer system. Next, we examined how the pointer system operates. Specifically, we asked whether pointers are allocated based on a spatial, featural, or object-based code. The results indicate that the pointer system relies on objecthood information to map and access each VWM representation.
Jan Theeuwes – Statistical learning drives visual selection
Lingering biases of attentional selection affect the deployment of attention above and beyond top-down and bottom-up control. In this talk I will present an overview of recent studies investigating how statistical learning regarding the distractor determines attentional control. In all these experiments we used the classic additional singleton task in which participants searched for a salient shape singleton while ignoring a color distractor singleton. The distractor singleton was presented more often in one location than in all other locations. Even though observers were not aware of the statistical regularities, we show that the location of the distractor was suppressed relative to all other locations. Moreover, we show that this learning is highly flexible and adaptive. We argue that selection history modulates the topographical landscape of spatial ‘priority’ maps, such that attention is biased towards locations having a high activation and biased away from locations that are suppressed.
Leonardo Chelazzi – Plasticity of priority maps of space
In the past we have pioneered research with human participants exploring the impact of reward on visual selective attention. For example, in a recent study using visual search, we have demonstrated that reward can alter the “landscape” of spatial priority maps, increasing priority for locations associated with greater reward during a learning phase and reducing it for locations associated with smaller reward. Importantly, we could also demonstrate that the effects persisted for several days after the end of the learning episode, during an extinction phase, and generalized to new tasks and stimuli. With an ongoing program of research, we are now assessing whether similar effects can be induced via statistical learning. In a series of experiments using variants of a visual search task, unbeknownst to the participants, we manipulate the probability of occurrence of the sought target and/or of a salient distractor across locations. The evidence indicates that, similar to the influence of reward, uneven probabilities of the critical items alter deployment of attention in a way that can optimize performance under certain conditions but can hinder it under other conditions. We argue that these effects reflect durable changes in priority maps of space. Importantly, in all cases above, changes in attentional performance were obtained even though participants had no clue as to the adopted manipulation. Future studies will try to understand whether reward-based learning and statistical learning operate via shared or independent mechanisms. In summary, reward and statistical learning appear to be strong (and implicit) determinants of attentional deployment.
Hermann J. Müller – Learning to shield search from salient distractors: dimension-based mechanisms of distractor suppression
Recently, there has been a growing interest in the mechanisms that permit observers to mitigate the interference effects of salient, additional-singleton distractors in visual search, including (statistical) learning to suppress distractors that consistently appear at likely, as compared to unlikely, distractor locations in the display. We will report a set of experiments which collectively indicate that mechanisms of dimension-based shielding of search from distraction play a role in these phenomena. First, we show that distractors defined in a different dimension to the target (e.g., orientation-defined target, luminance-defined distractor) cause less interference than distractors defined in the same dimension as the target (orientation-defined target, orientation-defined distractor), even though the defining distractor feature is perfectly predictable and clearly separable from the target feature in both cases. Second, we show that observers can learn to down-modulate a whole display region (rather than just a specific, single location) where a distractor is likely (vs. unlikely) to occur, both with different-dimension and with same-dimension distractors. Third, we present evidence suggesting that the signal down-modulation may operate at a different level with distractors defined in a different vs. the same dimension to the target: a dimension-based (spatial) level vs. a supra-dimensional (priority-map) level. In particular: spatial suppression also affects target processing, but only with same-dimension distractors – indicating that different-dimension distractors are suppressed on the dimensional level. This appears to be at variance with other studies that found such a target-location effect even for different-dimension distractors. We will discuss possible factors responsible for these discrepant findings, including the probabilistic ‘cueing’ of a single distractor location vs. a whole distractor region as well as the role of practice on the task. 
Together, our evidence suggests that, while the shielding of search from salient distractors ultimately works via the priority map, suppression at the dimension-based level (prior to signal integration by the priority map) offers a ready strategy when distractors are defined in a non-target dimension.
Session 3: Brain mechanisms of visual search
Jeff Schall (Keynote) – Neural Control of Visual Search
This presentation will survey performance, neural and computational findings demonstrating that gaze is guided during visual search through the operation of distinct stages of visual selection and saccade preparation. These stages can be selectively manipulated through target-distractor similarity, stimulus-response mapping rules, and unexpected perturbation of the visual array. Such manipulations indicate that they are instantiated in different neural populations with distinct connectivity and functional properties. Race and accumulator models provide a comprehensive account of the saccade preparation stage and of the conversion of salience evidence into saccade commands.
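To make the notion of a race between accumulators concrete, here is a generic sketch of independent accumulators rising toward a fixed threshold. It is not the specific model discussed in the talk: the function name `race`, the drift values, the threshold, the time step, and the noise parameter are all hypothetical choices for illustration.

```python
import random


def race(drifts, threshold=1.0, noise_sd=0.0, dt=0.001, seed=0, max_steps=100_000):
    """Simulate independent accumulators racing to a common threshold.

    Each accumulator integrates its drift rate (standing in for the salience
    evidence at one location) plus optional Gaussian noise. Returns
    (winner_index, finishing_time_in_seconds), or (None, None) if no
    accumulator reaches the threshold within `max_steps`.
    """
    rng = random.Random(seed)
    levels = [0.0] * len(drifts)
    for step in range(1, max_steps + 1):
        for i, drift in enumerate(drifts):
            levels[i] += drift * dt + rng.gauss(0.0, noise_sd) * dt ** 0.5
        leader = max(range(len(levels)), key=levels.__getitem__)
        if levels[leader] >= threshold:
            return leader, step * dt
    return None, None


if __name__ == "__main__":
    # Hypothetical drift rates: index 0 is the target, index 1 a distractor.
    winner, latency = race([5.0, 2.0], noise_sd=0.5, seed=1)
    print(winner, latency)
```

In a sketch like this, making the distractor more similar to the target corresponds to bringing the two drift rates closer together, which lengthens the race and makes errors more likely once noise is added.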
John Serences – Focal attention leads to warping of population codes in visual cortex
Visual search is often likened to shining a spotlight of attention at different positions in the visual field. Yet even covert attention to a single position, without search across the visual field, induces widespread modulations of neural populations with receptive fields (RFs) across the entire visual field. To investigate this ‘warping’ of spatial RFs, we used fMRI to examine changes in single voxel receptive fields (vRFs) and corresponding changes in the precision of representations based on larger populations of voxels arrayed across entire visual areas. We find that attention leads to large-scale modulations of vRF gain, size, and position, with position shifts contributing more to population-level enhancements of visual information than changes in vRF size or gain. These findings demonstrate that attending to even a single position leads to spatially widespread modulations in visual cortex, and that shifts in the position of RFs are a principal mechanism by which spatial attention enhances population codes for relevant visual information. This poses particular challenges for labeled line theories of information processing, suggesting that downstream regions likely rely on distributed inputs rather than single neuron-to-neuron mappings.
Daniel Baldauf – Functional connectivity mechanisms of attention
The neural mechanisms of spatial attention, via feedback signals from spatially-mapped control areas in frontal / parietal cortex, have been described in much detail. For non-spatial attention to different sensory modalities, complex objects, and so on, the control mechanisms seem much more complex and experimental work has just begun to identify possible sources of top-down control in the inferior part of frontal cortex. Obviously, however, spatial and non-spatial attention are often combined in everyday tasks. How these different control networks work together is a major question in cognitive neuroscience. To answer these remaining questions, we combined MEG and fMRI data in human subjects to identify not only the sources for spatial and non-spatial feedback signals, but also the mechanisms by which these different networks interact with sensory areas in attention. We identified two separable networks in the superior- and inferior-frontal cortex, mediating spatial versus non-spatial attention, respectively. Using multi-voxel pattern analysis, we found spatial and non-spatial information are represented in different subpopulations of frontal cortex. Most importantly, our analyses of temporally high-resolving MEG data also show that both control structures engage selectively in coherent interactions with sensory areas that represent the attended stimulus. Rather than a zero-phase lag connection, which would indicate common input, the interactions between frontal cortex and sensory areas are phase-shifted to allow for a 20ms transmission time. This seems to be just the right time for signals in one area to arrive at a time of maximum depolarization in the connected area, increasing their impact. Further, we were able to identify top-down directionality of these oscillatory interactions, establishing the superior- versus inferior-frontal cortex as key sources of spatial versus non-spatial attentional inputs, respectively.
Douglas P. Munoz – Neural Circuits for Saliency, Priority and Orienting
Since its introduction almost 30 years ago, saliency-map theory has attracted widespread attention. The concept of a priority map arose as an extension of this idea to include top-down, goal-dependent input in a combined representation of visual saliency and behavioral relevancy, which is thought to determine attention and gaze. Models of visual attention postulate that a bottom-up saliency map is formed early in the visual processing stream. Although studies have reported evidence of a saliency map in various cortical brain areas, determining the contribution of phylogenetically older pathways is crucial to understanding its origin. We compared saliency coding from neurons in two early gateways into the visual system, the primary visual cortex (V1) and the evolutionarily older superior colliculus (SC). We found that, while visual signals reached V1 sooner than the SC superficial visual layers (SCs), the saliency representation emerged earlier in SCs than in V1. Because the dominant input to the SCs arises from V1, these relative timings are consistent with the hypothesis that SCs neurons pool the inputs from multiple V1 neurons to form a feature-agnostic saliency map, which may then be relayed to other brain areas. If the salient stimulus or stimuli become the target for a future saccade then neurons in the intermediate layers of the SC (SCi) develop a robust code of priority. How saliency and priority are coded in the brain has largely been restricted to simple laboratory stimuli presented to subjects performing stereotypical tasks. I will also highlight recent evidence contrasting neural processing of saliency and priority in the SC during natural free viewing of videos. The take-home message is that the SCs robustly codes saliency, while the SCi robustly codes priority.
Talia Konkle – Predicting visual search from the representational architecture of high-level visual cortex
While many prominent models of visual search focus on characterizing how attention is deployed, it is also clear that representational factors contribute to visual search speeds, such as target-distractor similarity (Duncan and Humphreys, 1989). In this line of work, we examined the extent to which performance on a visual search task can be predicted from the stable representational architecture of the visual system, independent of attentional dynamics. Overall, we found strong brain/behavior correlations across most of the higher-level visual system, including both the ventral and dorsal pathways when considering both macro-scale sectors as well as smaller meso-scale regions. These results suggest that visual search for real-world object categories is well predicted by the stable, task-independent architecture of the visual system.
Jacqueline Gottlieb – The economics of search and attention: which target should I be looking for?
Studies of attention and visual search have focused on how we find targets given that we have been instructed what to attend to. But in natural behavior we do not have an instructor. Instead, our brains autonomously decide which sensory stimulus is most useful in a situation. Behavioral studies suggest that the brain makes these decisions based on cost-benefit analyses that consider the information gains, reward associations and costs (difficulty) related to finding and discriminating sensory cues. I will describe findings from our laboratory that begin to unravel how these quantities are represented in individual cells. While our studies have focused on the lateral intraparietal area (LIP), they suggest that many of the relevant processes depend on structures beyond the classical priority maps, thus linking the study of attention with traditionally separate topics of motivation, decision making and cognitive control.
Session 4: New data and models of visual search
Gregory Zelinsky (Keynote) – Predicting goal-directed attention control: A tale of two deep networks
The ability to control the allocation of attention underlies all goal-directed behavior. Here two recent efforts are summarized that apply deep learning methods to model this core perceptual-cognitive ability.
The first of these is Deep-BCN, the first deep neural network implementation of the widely-accepted biased-competition theory (BCT) of attention control. Deep-BCN is an 8-layer deep network pre-trained for object classification, one whose layers and their functional connectivity are mapped to early-visual (V1, V2/V3, V4), ventral (PIT, AIT), and frontal (PFC) brain areas as informed by BCT. Deep-BCN also has a superior colliculus and a frontal-eye field, and can therefore make eye movements. We compared Deep-BCN’s eye movements to those made by 15 people performing a categorical search for one of 25 target categories of common objects and found that it predicted both the number of fixations during search and the saccade-distance travelled before search termination. With Deep-BCN, a DNN implementation of BCT now exists that can be used to predict the neural and behavioral responses of an attention control mechanism as it mediates a goal-directed behavior—in our study the eye movements made in search of a target goal.
The second model of attention control is ATTNet, a deep network model of the ATTention Network. ATTNet is similar to Deep-BCN in that both have layers mapped to early-visual and ventral brain structures in the attention network and are aligned with BCT. However, they differ in two key respects. ATTNet includes layers mapped to dorsal structures, enabling it to learn how to prioritize the selection of visual inputs for the purpose of directing a high-resolution attention window. But a more fundamental difference is that ATTNet learns to shift its attention as it greedily seeks out reward. Using deep reinforcement learning, an attention shift to a target object elicits reward that makes all the network’s states leading up to that covert action more likely to occur in the future. ATTNet also learns to prioritize the visual input so as to efficiently control the direction of its focal routing window—the colloquial spotlight of attention. It does this, not only to find reward faster, but also to restrict its visual inputs to potentially rewarding patterns for the purpose of improving classification success. This selective routing behavior was quantified as a “priority map” and used to predict the gaze fixations made by 30 subjects searching 240 images from Microsoft COCO (the dataset used to train ATTNet) for a target from one of three object categories. Both subjects and ATTNet showed evidence for attention being preferentially directed to target goals, behaviorally measured as oculomotor guidance to the targets. Other well-established findings in the search literature were observed.
In summary, ATTNet is the first behaviorally-validated model of attention control that uses deep reinforcement learning to learn to shift a focal routing window to select image patterns. This is theoretically important in that it shows how a reward-based mechanism might be used by the brain to learn how to shift attention. Deep-BCN is also theoretically important in being the first deep network designed to capture the core tenet of BCT: that a top-down goal state biases a competition among object representations for the selective routing of a visual input, with the purpose of this selective routing being greater classification success. Together, Deep-BCN and ATTNet begin to explore the space of ways that cognitive neuroscience and machine learning can blend to form a new computational neuroscience, one harnessing the power and promise of deep learning.
Ruth Rosenholtz – Capacity limits and how the visual system copes with them
Our visual system cannot process everything with full fidelity, nor, in a given moment, perform all possible visual tasks. Rather, it must lose some information, and prioritize some tasks over others. A number of strategies have developed for dealing with this limited capacity. A popular proposal posits limited access to higher-level processing; that a mechanism known as selective attention serially gates access to that resource; and that the gate operates early in visual processing. However, since this account was originally proposed, we as a field have learned a great deal about capacity limits in vision. I will discuss the implications for selective attention theory. Furthermore, I will examine what we have learned from studying an alternative mechanism for dealing with limited capacity: efficient coding, particularly in the visual periphery. In this scheme, visual processing has limited bandwidth rather than limited access to higher-level processing. Finally, evidence suggests that we should look for additional capacity limits late in processing, taking the form of general-purpose limits on the complexity of the tasks one can perform at a given moment. A general-purpose decision process may deal with such limits by “cutting corners” when the task becomes too complicated.
Melissa Le-Hoa Võ – The role of anchor objects in guiding real-world search
General scene knowledge (our “scene grammar”) plays an important role in both identifying and locating objects in the real world. This knowledge reflects co-occurrences of scene elements and their structural regularities. When trying to locate an object, predicting the spatial relationship between various objects within a single scene is key for efficient search performance. We propose that the arrangement of objects is not only rule-governed, but hierarchical in its structure. In particular, we believe that some objects within each scene category function as anchors, carrying strong spatial predictions regarding other objects within the scene (e.g. the stove anchors the position of the pot). Therefore, these “anchors” constitute key elements in the hierarchy of objects in scenes and allow search in real-world scenes to be guided efficiently. To test this hypothesis and to quantify the spatial relationship between objects in different scene categories, we extracted the spatial locations of objects from an image database. Inspired by graph theory, we captured the relationship of objects as a set of nodes connected by edges of varying weights. Based on these weights and combined with cluster analyses, we identified “anchor” objects. We tested the behavioral relevance of the weight parameters by correlating them with search performance in a different set of scenes. Results show that reaction time decreases as weights increase and that swapping anchors impedes search. We take this as first evidence that anchors play an important role in guiding search through naturalistic scenes.
Monica Castelhano – The Surface Guidance Framework: How Scene Surface can inform Search Strategies
The spatial relationship between objects and scenes and its effects on visual search performance have been well established. In previous studies, we have shown that the spatial relationship can be exploited to explain eye movement patterns, to explain how initial scene representations affect subsequent search performance, and to distinguish the contribution of spatial vs. semantic information.
Using the newly proposed Surface Guidance Framework, we operationalize target relevant and irrelevant scene regions. We divide scenes into three regions (upper, mid, lower) that correspond with possible relevant surfaces (wall, countertop, floor). A target relevant region is defined as the region in which a target object is expected (e.g., painting, toaster, rug). Here, we explore how relevant and irrelevant regions of a scene are processed in two classic visual search paradigms (set size and sudden onset) to further explore mechanisms of attention during search in scenes.
In Study 1, we explored how spatial associations affect search by manipulating set size in either target relevant or target irrelevant regions. We found that set size increases adversely affected search performance only in target relevant regions. In Study 2, we manipulated whether a suddenly-onsetting distractor object appeared in a target relevant or target irrelevant region. We found that fixations to the distractor were significantly more likely to occur in the target relevant condition and negatively affected search performance.
The Surface Guidance Framework allows us to further explore how spatial associations can narrow processing to specific areas of the scene relevant to the task. Viewing effects of scene context through the lens of target relevancy allows us to develop a new understanding of how the spatial relationship between objects and scenes can affect performance and processing.
Arni Kristjánsson – New insights from visual foraging tasks into visual attention and visual working memory
The assessment of the functional properties of visual attention and visual working memory has in past decades been dominated by single-target visual searches. But our goals from one moment to the next are unlikely to involve only a single target, and more recently, paradigms involving visual foraging for multiple targets have been used to investigate visual attention and working memory. Set-size effects in single-target visual search tasks partly form the foundation of many theories of visual search. We therefore manipulated set-size in a visual foraging task, involving both “feature” and “conjunction” foraging. The target selection times during foraging revealed specific components of the foraging pattern indicating that single-target search tasks only provide a snapshot of visual attention. Foraging tasks can also provide insights into the operational principles of visual working memory. Our results indicate that participants are able to change their foraging patterns according to task demands, which suggests that the visual working memory representations used for attentional guidance are flexible and not restricted to a single value, as some current theories suggest. Our results show how single-target visual search tasks vastly undersample the operation of visual attention and visual working memory, providing only a snapshot of their function; this limited information is bound to be reflected in theoretical accounts based on such tasks.
Michael Hout – Passive search strategies improve attentional guidance and object recognition during demanding visual search
Hybrid visual memory search (i.e., search for more items than can be maintained in working memory) requires observers to search both through a visual display and through the contents of memory in order to find designated “target” items (e.g., walking through the grocery store looking for items on your grocery list, airport baggage screeners looking for many prohibited items in travelers’ luggage). A substantial body of research on this task has shown that observers are able to search for a very large number of items with relative ease. However, the attentional mechanisms that drive hybrid search remain somewhat unclear. In our first two experiments, we investigated the role that cognitive strategies play in facilitating hybrid search for categorically-defined targets. We hypothesized that observers in a hybrid search task would naturally adopt a strategy in which they remain somewhat passive, allowing targets to “pop out,” rather than actively directing their attention around the visual display. Experiment 1 compared behavioral responses in passive, active, and uninstructed hybrid search. Contrary to our expectations, we found that uninstructed search tended to be active in nature, but we also found that adopting a passive strategy led to more efficient performance. In Experiment 2, we replicated these findings, and tracked the eye movements of observers. We found that oculomotor behavior in passive hybrid search was characterized by faster, larger saccades, a tendency to fixate fewer non-target items, and an improved ability to classify items as either targets or distractors. In Experiment 3, we explored whether the benefits of passive search were limited only to particularly demanding search tasks (i.e., those that require observers to search for many items at once), or if performance benefits also appear when people are asked to find a single, categorically-defined target. 
Once again, we tracked the eye movements of participants and found results strikingly similar to those of our hybrid search task: passive searchers were faster and less accurate, but more efficient overall. Additionally, passive search led to improved attentional guidance, better object recognition, and fewer target recognition failures. Together, our results yield two surprising findings: first, that hybrid visual search is more active in nature than expected, and second, that adopting a passive search strategy leads to performance and oculomotor improvements during hybrid and single-target search. These findings fill a gap in the literature regarding the nature of strategy use during visual search, and the potential benefits of strategy adoption during challenging search tasks.