Visual search
Visual search is a type of perceptual task requiring attention that typically involves an active scan of the visual environment for a particular object or feature (the target) among other objects or features (the distractors). Visual search can take place either with or without eye movements. Common examples include trying to locate a certain brand of cereal at the grocery store or a friend in a crowd (e.g. Where's Waldo?). The scientific study of visual search typically makes use of simple, well-defined search items such as oriented bars or colored letters. The cognitive architecture of the visual system is then assessed by establishing which factors affect the amount of time taken by the observer to indicate whether a search target is present or absent. One of the most common factors affecting such measures of reaction time (RT) is the number of distractors present in the visual search task. An increase in the number of distractors often leads to an increase in search RT and is thus also associated with an increase in the difficulty of the task (see Fig. 1). The involvement of attention in the search task is commonly measured as the slope of the response-time function over display size, that is, over the number of distractors (the RT slope).
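The RT slope itself is just the slope of a line fitted to mean reaction times as a function of display size. The following Python sketch is a minimal illustration of that calculation; the set sizes and RT values are invented for the example rather than taken from any study.

import numpy as np

# Hypothetical mean reaction times (ms) at four display sizes.
set_sizes = np.array([4, 8, 16, 32])
feature_rt = np.array([520, 522, 525, 527])       # efficient ("pop-out") search: nearly flat
conjunction_rt = np.array([560, 680, 920, 1400])  # inefficient search: RT grows with set size

for label, rt in (("feature", feature_rt), ("conjunction", conjunction_rt)):
    slope, intercept = np.polyfit(set_sizes, rt, 1)   # least-squares line: RT = slope * size + intercept
    print(f"{label} search: slope = {slope:.1f} ms/item, intercept = {intercept:.0f} ms")

A shallow slope (a few milliseconds per item or less) indicates that additional distractors barely tax attention, whereas a steep slope indicates a substantial attentional cost per item.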
Types
Feature search
Feature search is the process of searching for a target that differs from the distractors by a unique visual feature, such as color, size, orientation, or shape. For example, an O is quickly found among Xs, and a red target is quickly found if all the distractors are blue (see Fig. 2). Results tend to be rapid because the unique feature “pops out”. Therefore, reaction time (RT) slopes tend to be shallow or flat, indicating that the number of distractors has minimal effect on RT when the target possesses an easily discriminable feature, such as color, size, or orientation, that differs from the distractors. Physiological evidence suggests that specialized visual receptors respond to different visual features such as orientation, color, spatial frequency, or movement. These features are analyzed in the early stages of vision, and a representation of each feature is mapped in different areas of the brain.
Conjunction search
Conjunction search is the process of searching for a target that is not defined by any single unique visual feature, but by a combination of two or more features. For example, the observer searches for an orange square among blue squares and orange triangles (see Fig. 3). Results tend to be slower than in a feature search because one must integrate information about two visual features in order to locate the target. Therefore, RT slopes tend to be steep because of the large effect the number of distractors has on RT. According to the Feature Integration Theory (FIT) (discussed below), the steep slope arises because the integration of different features must be performed for each item in the display until the target is found. This is referred to as a serial search. For example, in the previously mentioned visual search task of locating the orange square among blue squares and orange triangles, one must integrate the colour (orange or blue) and shape (square or triangle) features for every separate item in the display until the target is found.
The right parietal visual cortex has been identified as being involved in conjunction searches. Ashbridge, Walsh, and Cowey (1997) found that applying transcranial magnetic stimulation (TMS) to the right parietal cortex 100 ms after the visual stimulus significantly impairs reaction times for conjunction searches but not for feature searches. Furthermore, Corbetta, Shulman, Miezin, and Petersen (1995), using positron emission tomography (PET), found that part of the superior parietal cortex was activated during spatial attention shifts and visual feature conjunction searches but not during feature searches.
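The serial, self-terminating search described above can be made concrete with a toy simulation. This is only a sketch under invented assumptions: the 45 ms per-item inspection time and the 500 ms baseline are arbitrary illustrative numbers, not estimates from the literature. A parallel "pop-out" search is modelled as taking constant time, while a serial search inspects items in random order until it reaches the target, so its expected RT grows linearly with display size.

import random

def simulated_rt(display_size, serial, per_item_ms=45, base_ms=500):
    # Parallel search: display size has no effect.
    # Serial self-terminating search: items are inspected one by one,
    # in random order, until the target is reached.
    if not serial:
        return base_ms
    items_inspected = random.randint(1, display_size)
    return base_ms + per_item_ms * items_inspected

for n in (4, 8, 16, 32):
    feature = sum(simulated_rt(n, serial=False) for _ in range(2000)) / 2000
    conjunction = sum(simulated_rt(n, serial=True) for _ in range(2000)) / 2000
    print(f"display size {n:2d}: feature ~ {feature:.0f} ms, conjunction ~ {conjunction:.0f} ms")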
Visual orienting
In Posner’s spatial cueing task, used in research on spatial attention, there is a slowing of reaction time when attention returns to a previously attended location. This is termed inhibition of return, and it functions as a bias against reorienting visual attention to a previously cued location. Further research suggests that inhibition of return is related both to the spatial location to which attention returns and to the object that occupies that location. Therefore, if the object moves, the inhibition can move with the object rather than remaining entirely at the initial location. In a study by Ro et al. (2003), single-pulse transcranial magnetic stimulation was used to show that the human frontal eye fields play a crucial role in the generation of inhibition of return.
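As a worked illustration of this cueing pattern, the reaction times below are hypothetical numbers chosen only to show the typical direction of the effects: detection at the cued location is faster shortly after the cue, but slower once enough time has passed for inhibition of return to develop.

# Hypothetical mean detection RTs (ms) in a Posner spatial cueing task.
rt_ms = {
    (100, "cued"): 310, (100, "uncued"): 340,   # short cue-target interval: facilitation
    (800, "cued"): 365, (800, "uncued"): 345,   # long cue-target interval: inhibition of return
}

for interval in (100, 800):
    effect = rt_ms[(interval, "cued")] - rt_ms[(interval, "uncued")]
    label = "facilitation at the cued location" if effect < 0 else "inhibition of return"
    print(f"cue-target interval {interval} ms: cued minus uncued = {effect:+d} ms ({label})")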
One way to select information is to orient to it, known as visual orienting. The most obvious way for visual orienting to take place is by overtly moving the head and eyes toward the visual stimulus. However, there are also brain mechanisms for visual orienting that do not require any overt change in head or eye position. In the 1970s, it was found that cells in the parietal lobe increased their firing rate in response to stimuli in their receptive field when monkeys attended to peripheral stimuli, even when no eye movements were allowed. It was also shown that humans could covertly shift attention to peripheral stimuli; when they did so, they responded more rapidly, detected stimuli at lower thresholds, and showed enhanced electrical activity at the attended location. Lesions of the parietal lobe specifically damaged this covert orienting ability on the side of space opposite to the lesion. These findings suggest that the parietal lobe is involved in covert orienting to visual stimuli.
Attention to visual stimuli can be thought of as a "spotlight" that highlights a particular location in space. The spotlight can be attracted by a sudden change in the periphery; in other words, attention is externally guided by a stimulus, which is known as exogenous orienting. There is also endogenous orienting, in which attention is guided by the goals of the perceiver. Thus, the focus of attention can be manipulated by the demands of a task. Visual search is a paradigm that uses endogenous orienting because participants have the goal of detecting the presence or absence of a specific target object in an array of other, distracting objects.
Feature integration theory (FIT)
One popular explanation for the different RTs of feature and conjunction searches is the Feature Integration Theory (FIT), introduced by Treisman and Gelade in 1980. This theory proposes that visual features such as color and shape are registered early, automatically, and coded in parallel across the visual field without the use of attention. For example, a red X can be quickly found among any number of black Xs and Os because the red X has the discriminative feature of colour and will "pop out". In contrast, the theory also proposes that integrating two or more visual features belonging to the same object requires a later process involving the integration of information from different brain areas, which is coded serially with focal attention. For example, when locating an orange square among blue squares and orange triangles, neither the colour feature "orange" nor the shape feature "square" is sufficient to locate the search target; instead, one must integrate information about both colour and shape to locate the target.
Evidence that attention, and thus later visual processing, is needed to integrate two or more features of the same object comes from the occurrence of illusory conjunctions, in which features do not combine correctly. For example, if a display of a green X and a red O is flashed on a screen so briefly that the later visual process of a serial search with focal attention cannot occur, the observer may report seeing a red X and a green O. Neural evidence links the parietal lobe to the correct integration of visual features. Friedman-Hill, Robertson, and Treisman (1995) conducted a case study with a patient, R.M., who had symmetrical bilateral parieto-occipital lesions and no temporal or frontal lobe damage. R.M. could recognize visual features but could not correctly integrate them.
Guided search
Pre-attentive processes exist to direct attention to interesting locations in the visual field. There are two ways in which pre-attentive processes can be used to direct attention: bottom-up processing (stimulus-driven) and top-down processing (user-driven). In the Guided Search model by Jeremy Wolfe, information from top-down and bottom-up processing of the stimulus is used to create a ranking of items in order of their attentional priority. In a visual search, attention is directed to the item with the highest priority. If that item is rejected, attention moves on to the next item, and so forth. The initial, pre-attentive stage of guided search operates in parallel across the display.
An activation map is a representation of visual space in which the level of activation at a location reflects the likelihood that the location contains a target. This likelihood is based on the perceiver's pre-attentive, featural information. According to the Guided Search theory, the initial processing of basic features produces an activation map, with every item in the visual display having its own level of activation. Attention is directed to peaks of activation in the activation map during the search for the target. Thus, search is efficient if the target generates the highest, or one of the highest, activation peaks. For example, suppose someone is searching for red, horizontal targets. Feature processing would activate all red objects and all horizontal objects. Attention is then directed to items depending on their level of activation, starting with those most activated. This explains why search times are longer when distractors share one or more features with the target stimuli.
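The activation-map idea can be sketched as a toy model. This is illustrative only, not Wolfe's actual implementation: the feature coding, the top-down weight, and the noise term standing in for bottom-up salience are all assumptions made for the example. Each item receives activation for every feature it shares with the target, plus noise, and attention then visits items in order of decreasing activation.

import random

# The target is a red, horizontal item; distractors share zero or one feature with it.
target = {"color": "red", "orientation": "horizontal"}
display = [
    {"color": "red", "orientation": "vertical"},
    {"color": "blue", "orientation": "horizontal"},
    {"color": "red", "orientation": "horizontal"},   # the target
    {"color": "blue", "orientation": "vertical"},
]

def activation(item, top_down_weight=1.0, noise=0.3):
    # One unit of top-down activation per feature shared with the target,
    # plus a small random term standing in for bottom-up salience.
    shared = sum(item[feature] == target[feature] for feature in target)
    return top_down_weight * shared + random.uniform(0.0, noise)

# Attention samples peaks of the activation map from the highest downward.
for rank, item in enumerate(sorted(display, key=activation, reverse=True), start=1):
    print(f"inspected {rank}: {item}")

Because distractors that share a feature with the target also receive partial activation, they compete with the target for early inspection, which is why such distractors lengthen search times.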
Relationship with the parietal cortex
Visual search can proceed efficiently or inefficiently. During efficient search, performance is unaffected by the number of distractor items: the reaction time functions are flat, and the search is assumed to be a parallel search. In contrast, during inefficient search, the reaction time to identify the target increases linearly with the number of distractor items present. The posterior parietal cortex is involved during inefficient visual search. Patients with parietal lesions are impaired during inefficient, but not efficient, search directed to the side of space opposite the lesion. Brain-imaging studies have also shown activation of the posterior parietal cortex during visuospatial orienting.
In a study by Nobre et al. (2003), all visual search tasks activated an extensive network of cortical regions in the parietal, frontal, and occipital cortices, as well as the cerebellum. Multiple regions within the posterior parietal cortex were activated bilaterally, including the superior and inferior parietal lobules and the intraparietal sulcus. Activation of multiple regions within the posterior parietal cortex further suggests multiple functional contributions by different parietal areas.
Studies using only covert visual search conditions found enhanced activation in multiple posterior parietal areas and in frontal areas during inefficient relative to efficient visual search. Therefore, the participation of posterior parietal and frontal brain areas in visual search is not limited to their involvement in eye movements. Studies using overt visual search conditions, in contrast, have shown that areas in the superior parietal cortex and intraparietal sulcus are more active during inefficient visual search.
Findings by Nobre et al. (2003) confirmed that the posterior parietal cortex is indeed involved in visual search and that activation of the parietal lobe is mainly sensitive to the degree of search efficiency and less sensitive to the binding of features. Search efficiency has a more substantial effect on the brain regions participating in visual search than feature binding. Inefficient relative to efficient search conditions produce enhanced activity bilaterally in the superior parietal lobule, the intraparietal sulcus, and the right angular gyrus. Enhanced activity is also found in frontal, occipital, and cerebellar regions. Frontal activations include the right dorsolateral prefrontal cortex and bilateral ventrolateral premotor/prefrontal cortex. Feature binding, by contrast, exerts only sparse effects on brain activations. Search for conjunction targets, compared to search for feature targets, activates small clusters in the superior parietal lobule that overlap with the activation related to search efficiency. No brain region is selectively activated only by conditions requiring the binding of features.
The engagement of attention on a target is controlled by the pulvinar nucleus of the thalamus, which blocks input from unattended stimuli. The superior colliculus is involved in moving attention from one location to another, and the disengagement of attention is governed by the parietal lobe so that another stimulus can be processed.
Effects of aging
There is a vast amount of research indicating that performance in conjunctive visual search tasks significantly improves during childhood and declines in later life. More specifically, young adults have been shown to have faster RTs on conjunctive visual search tasks than both children and older adults, while their RTs are similar on feature visual search tasks. This suggests that something about the process of integrating visual features, or about serial searching, is difficult for children and older adults but not for young adults. Studies have suggested numerous mechanisms involved in this difficulty in children, including peripheral visual acuity, eye movement ability, the ability to move the attentional focus, and the ability to divide visual attention among multiple objects.
Studies have suggested similar mechanisms for the difficulty of older adults, such as age-related optical changes that influence peripheral acuity, the ability to move attention over the visual field, the ability to disengage attention, and the ability to ignore distractors.
A study by Lorenzo-López, Amenedo, Pascual-Marqui, and Cadaveira (2008) provides neurological evidence that older adults have slower RTs during conjunctive searches than young adults. Event-related potentials (ERPs) showed longer latencies and lower amplitudes in older subjects than in young adults at the P3 component, which is related to activity of the parietal lobes. This suggests the involvement of parietal lobe function in the age-related decline in the speed of visual search tasks. Results also showed that older adults, when compared to young adults, had significantly less activity in the anterior cingulate cortex and in many limbic and occipitotemporal regions involved in performing visual search tasks.
Face recognition
A growing body of research attests that the processes underlying object recognition and face recognition are different, and that faces have a special significance. However, a large portion of this research focuses on the cognitive processes involved, with the aim of improving computerized face detection and recognition systems.
Human vision makes use of a large amount of evidence from the retina, which makes it very detailed and usually accurate. Evidence from edges, corners, lines, bars, blobs, intensity, shape, texture, colour, and motion has been well utilized in vision research; however, the contextual knowledge that human vision exploits has to some extent been neglected in vision research.
Approaches
- Top-down based approach – postulates that there is a different face model at each level of a coarse-to-fine scale. An image is searched at the coarsest scale first, for efficiency, and once a match is found, it is searched at the second-coarsest scale, and so on, until the most detailed scale is reached. It is difficult to extend this approach to multiple views, since it is generally assumed that there is only one face model (in the fronto-parallel view) for each level of the scale.
- Bottom-up feature-based approach – facial features are searched for individually and then geometrically grouped into 'face candidates'. This approach lends itself to searching multiple views, but it cannot accommodate different imaging conditions because of variation in the image structure of facial features.
- Texture-based approach – the spatial distribution of the grey-level information in Haralick's subimage matrices is analysed to detect faces. This method is not easily applied to different viewpoints.
- Neural network approach – faces are detected by subsampling different regions of an image to a standard-sized subimage and passing this information through a neural network filter. Fronto-parallel faces appear to be accommodated by the algorithm, but not different views of faces; profile-view detection is still entirely outside its scope.
- Colour-based approach – individual pixels are labelled according to their similarity to skin colour, and subregions containing a large blob of skin-coloured pixels are then labelled (a minimal sketch of this idea follows this list). This approach generalizes across face viewpoints but can be defeated by face shape or skin colour.
- Motion-based approach – the moving foreground is extracted from the static background through image subtraction; the silhouette or colour of the differenced image is then used to locate faces. Multiple moving objects will seriously disrupt this approach.
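The colour-based approach can be illustrated with a minimal sketch. This is a rough illustration only: the RGB skin rule and the 200-pixel blob threshold are simplified assumptions, not a published detector. Pixels are labelled as skin-like and the bounding box of the skin-coloured region is returned as a face candidate.

import numpy as np

def face_candidate(image):
    # image: H x W x 3 uint8 RGB array. Returns (top, bottom, left, right) of the
    # skin-coloured region, or None if too few skin-like pixels are found.
    r = image[..., 0].astype(int)
    g = image[..., 1].astype(int)
    b = image[..., 2].astype(int)
    # Crude skin-colour rule: reddish pixels that are neither too dark nor grey.
    skin = (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b) & ((r - np.minimum(g, b)) > 15)
    if skin.sum() < 200:
        return None
    rows = np.where(skin.any(axis=1))[0]
    cols = np.where(skin.any(axis=0))[0]
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])

# Synthetic example: a skin-coloured rectangle on a blue background.
img = np.zeros((120, 160, 3), dtype=np.uint8)
img[..., 2] = 200                       # blue background
img[30:90, 50:110] = (210, 150, 120)    # skin-like patch
print(face_candidate(img))              # (30, 89, 50, 109)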
Involvement of neural substrates
The first indication of a specialized system for face detection was the existence of prosopagnosia, the inability to recognize familiar faces while for the most part retaining the ability to recognize other objects. Prosopagnosia is caused by lesions in the ventral occipitotemporal cortex, usually bilaterally, although a few cases involve only the right hemisphere.
Further evidence of specialized structures for face detection comes from single-unit recording studies on macaque monkeys, which have identified neurons that code for faces (and some for specific human faces) in the superior temporal sulcus and the inferior temporal cortex.
Face categorization studies
- Visual search paradigms have been employed in face recognition tasks that involve judging the presence of an intact face in a limited situation (an array of jumbled faces). Valentine and Bruce found that typical and upright faces were detected faster than distinctive and inverted faces, respectively. Nothdurft used schematic faces to determine that reaction time for detection of an intact face increases as a function of the number of distractors in the array. This seemed to indicate that face detection occurs through serial, rather than parallel, searching, and that the 'pop-out' effect does not occur with faces.
- More recent studies, however, which involved face detection in a more natural scene, provide evidence that pop-out does occur with faces, suggesting that, in naturalistic settings, faces are detected through a parallel pre-attentive search strategy.
Face-detection effect studies
- The face-detection effect suggests that faces have a special significance even in early processing stages. The face-detection effect, discovered by Purcell and Stewart, is the finding that upright faces are detected more easily than alternate, visually matched images (e.g. an inverted or jumbled face).
- Further evidence for the special significance of the face was found in studies by Vuilleumier, who worked with left spatial neglect patients, and Ro et al., whose experiment was based on a flicker-paradigm design. These studies concluded that faces automatically command attention when competing with other stimuli.
- More recent experiments by Lewis and Edmonds have determined that faces are detected more quickly in natural than in scrambled scenes, that obscuring the eyes negatively affects reaction times in face detection, that upright, high-contrast, and clear faces are detected faster than inverted, low-contrast, and blurred faces, respectively, and that inversion, hue reversal, and luminance reversal slow the process of face detection in an additive manner.
Effect of Alzheimer's
Patients with dementia of the Alzheimer type (DAT) show a significant benefit from spatial cueing, but this benefit is obtained only for cues with high spatial precision. In fact, the reduction in the dynamic range of spatial attention is so clear that only the smallest cue used in a study by Parasuraman et al. (2000) facilitated visual search speed in DAT patients.
Abnormal visual attention may underlie certain visuospatial difficulties in patients with Alzheimer’s disease (AD). Patients with AD show hypometabolism and neuropathology in the parietal cortex, and given the role of parietal function in visual attention, patients with AD may exhibit hemispatial neglect, a deficit in attention to and awareness of one side of space after damage to one hemisphere of the brain. In AD, hemispatial neglect on visual search tasks may relate to difficulty in disengaging attention during visual search.
An experiment conducted by Tales et al. (2002) investigated the ability of patients with AD to perform various types of visual search task. They found that people with AD were significantly impaired overall on visual search tasks. Their results showed that search rates on the “pop-out” task were similar for the AD and control groups; however, people with AD searched significantly more slowly than the control group on the conjunction task. One interpretation of these results is that the visual system of AD patients has a problem with feature binding, such that it is unable to communicate efficiently the different feature descriptions of the stimulus. Features are typically analyzed in functionally and anatomically separate cortical areas, and an impairment of “binding” would result in an impaired ability to compare across these features. The binding of features is thought to be mediated by areas such as the temporal and parietal cortex, and these areas are known to be affected by AD-related pathology.
In conjunction search, healthy people are thought to employ grouping strategies among the distractors in order to reduce the need for attention shifting, thereby improving search efficiency. “Grouping” is different from “binding”: grouping is the ability to jointly represent similar items distributed across space, whereas binding is the ability to jointly represent the different characteristics of a single item in visual space. A second possibility is that AD patients have a reduction in the efficiency of basic visual processing, such as the grouping needed to form a grouping strategy during visual search tasks; this would result in a greater need for attention shifting in order to detect the target among the distractors. A third possibility for the impairment of people with AD on conjunction searches is that there may be some damage to general attentional mechanisms in AD, so that any attention-related task, including visual search, is affected.
Tales et al. (2002) detected a double dissociation between AD and visual search in their experimental results. Earlier work had been carried out on the impairment that patients with Parkinson's disease (PD) show on visual search tasks. In those studies, evidence was found of impairment in PD patients on the “pop-out” task, but no evidence was found of impairment on the conjunction task. As discussed, AD patients show the exact opposite pattern: normal performance was seen on the “pop-out” task, but impairment was found on the conjunction task. This double dissociation provides evidence that PD and AD affect the visual pathway in different ways, and that the pop-out task and the conjunction task are processed differently within that pathway.
Effects of Autism
In line with previous research, O’Riordan, Plaisted, Driver, and Baron-Cohen showed that autistic individuals performed better, and thus with lower RTs, than matched controls without autism on feature and conjunctive visual search tasks. Several explanations for this phenomenon have been suggested. First, visual search tasks most likely involve exogenous orienting of attention, meaning that attention is guided by an external stimulus. A study by Swettenham, Milne, Plaisted, Campbell, and Coleman reported that autistic individuals had impaired endogenous attention shifts but intact exogenous shifts; since visual search tasks emphasize exogenous attention shifting, this could explain how autistic individuals can show superior performance on these tasks. A second explanation is that autistic individuals show superior performance in discrimination tasks between similar stimuli and may therefore have an enhanced ability to differentiate between items in the visual search display. A third suggestion is that autistic individuals may have stronger top-down target excitation and stronger distractor inhibition than controls (O’Riordan, Plaisted, Driver, and Baron-Cohen).
Keehn, Brenner, Palmer, Lincoln, and Müller used an event-related functional magnetic resonance imaging design to study the neurofunctional correlates of visual search in autistic children and matched typically developing children. The autistic children showed superior search efficiency and increased neural activation in frontal, parietal, and occipital regions compared with the typically developing children. Thus, autistic individuals’ superior performance on visual search tasks may reflect enhanced discrimination of items in the display, which is associated with occipital activity, and increased top-down shifts of visual attention, which are associated with frontal and parietal areas.
Works cited
- Ashbridge, E., Walsh, V., and Cowey, A. 1997. Temporal aspects of visual search studied by transcranial magnetic stimulation. Neuropsychologia, 35: 1121–1131.
- Aglioti, S., Smania, N., Barbieri, C., and Corbetta, M. 1997. Influence of stimulus salience and attentional demands on visual search patterns in hemispatial neglect. Brain Cogn. 34: 388–403.
- Arguin, M., Cavanagh, P., and Joanette, Y. 1994. Visual feature integration with an attention deficit. Brain Cogn. 24: 44–56.
- Brown V, Huey D, & Findlay J M. (1997). Face detection in peripheral vision: do faces pop out? Perception Vol 26, pp. 1555–1570.
- Corbetta, M., Miezin, F. M., Shulman, G. L., and Petersen, S. E. 1993. A PET study of visuospatial attention. J. Neurosci. 13: 1202–1226.
- Corbetta, M., Kincade, J. M., Ollinger, J. M., McAvoy, M. P., and Shulman, G. L. 2000. Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nat. Neurosci. 3: 292–297.
- Corbetta, M., Shulman, G. L., Miezin, F. M., and Petersen, S. E. 1995. Superior parietal cortex activation during spatial attention shifts and visual feature conjunction. Science 270: 802–805.
- Donner, T. H., Kettermann, A., Diesch, E., Ostendorf, F., Villringer, A., and Brandt, S. A. 2000. Involvement of the human frontal eye field and multiple parietal areas in covert visual selection during conjunction search. Eur. J. Neurosci. 12: 3407–3414.
- Donner, T. H., Kettermann, A., Diesch, E., Ostendorf, F., Villringer, A., and Brandt, S. A. 2002. Visual feature and conjunction searches of equal difficulty engage only partially overlapping frontoparietal networks. NeuroImage 15: 16–25.
- Eglin, M., Robertson, L. C., and Knight, R. T. 1991. Cortical substrates supporting visual search in humans. Cereb. Cortex. 1: 262– 272.
- Elgavi-Hershler O, & Hochstein S. (2002). Vision at a glance: a high-level pop-out effect for faces. Perception Vol 31, Supplement 20.
- Farah M J, Wilson K D, Drain M, & Tanaka J W. (1998). What is 'special' about face perception? Psychological Review Vol 105, pp. 482 – 498.
- Friedman-Hill, S. R., Robertson, L. C., and Treisman, A. 1995. Parietal contributions to visual feature binding: Evidence from a patient with bilateral lesions. Science 69: 853–855.
- Gitelman, D. R., Nobre, A. C., Parrish, T. B., LaBar, K. S., Kim, Y. H., Meyer, J. R., and Mesulam, M. 1999. A large-scale distributed network for covert spatial attention: Further anatomical delineation based on stringent behavioural and cognitive controls. Brain 122: 1093–1096.
- Haxby, James V., Hoffman, Elizabeth A., & Gobbini, Ida M.. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, Vol 4-6, pp. 223 – 233.
- Hopfinger, J. B., Buonocore, M. H., and Mangun, G. R. 2000. The neural mechanisms of top-down attentional control. Nat. Neurosci. 3: 284–291.
- Leonards, U., Sunaert, S., Van Hecke, P., and Orban, G. 2000. Attention mechanisms in visual search—An fMRI study. J. Cogn. Neurosci. 12: 61–75.
- Lewis M. B., & Edmonds A. J.. (2002). Localisation and detection of faces in naturalistic scenes. Perception Vol 31, Supplement 19.
- Lewis, Michael B., & Edmonds, Andrew J.. (2003). Face Detection: Mapping Human Performance. Perception Vol 32, pp. 903 – 920.
- Mendez, M. F., Cherrier, M. M., and Cymerman, J. S. 1997. Hemispatial neglect on visual search tasks in Alzheimer’s disease. Cognitive and Behavioural Neurology, 10(3).
- Nobre, A. C., Sebestyen, G. N., Gitelman, D. R., Mesulam, M. M., Frackowiak, R. S., and Frith, C. D. 1997. Functional localization of the system for visuospatial attention using positron emission tomography. Brain 120: 515–533.
- Nobre, A. C., Sebestyen, G. N., Gitelman, D. R., Frith, C. D., and Mesulam, M. M. 2002. Filtering of distractors during visual search studied by positron emission tomography. NeuroImage 16: 968–976, doi:10.1006/nimg.2002.1137.
- Nobre, A. C., Coull, V., and Frith, C. D. 2003. Brain activations during visual search: contributions of search efficiency versus feature binding. NeuroImage, 18: 91–103.
- Nothdurft H-C. (1993). Faces and facial expressions do not pop out. Perception Vol 22 pp. 1287–1298.
- Parasuraman, R., Greenwood, P. M., and Alexander, G. E. Alzheimer disease constricts the dynamic range of spatial attention in visual search. Neuropsychologia, 2000; 38, 1126-1135.
- Posner, M. I., & Dehaene, S. 1994. Attentional networks. Trends in Neurosciences, 17(2): 75–79.
- Purcell D G, & Stewart A L. (1986). The face-detection effect. Bulletin of the Psychonomic Society Vol 24, pp. 118 – 120.
- Purcell D G, & Stewart A L. (1988). The face-detection effect: Configuration enhances perception. Perception & Psychophysics Vol 43, pp. 355 – 366.
- Purcell D G, & Stewart A L. (1991). The object-detection effect: Configuration enhances perception. Perception & Psychophysics Vol 50, pp. 215 – 224.
- Ro T, Russell C, & Lavie N. (2001). Changing faces: A detection advantage in the flicker paradigm. Psychological Science Vol 12, pp. 94–99.
- Ro, T., Farn, A., Chang, E. 2003. Inhibition of return and the human frontal eye fields. Exp Brain Res, 150: 290-296.
- Tales, A., Butler, S. R., Fossey, J., Gilchrist, I. D., Jones, R. W., Troscianko, T. 2002. Visual search in Alzheimer’s disease: a deficiency in processing conjunctions of features. Neuropsychologia, 40: 1849-1857.
- Tipper, S.P., Driver, J., & Weaver, B. 1991. Object-centred inhibition of return of visual attention. Quarterly Journal of Experimental Psychology, 43A: 289-298.
- Treisman A, & Gelade G. (1980). A feature integration theory of attention. Cognitive Psychology Vol 12, pp. 97 – 136.
- Trick, L.M., & Enns, J.T. (1998). Life-span changes in attention: The visual search task. Cognitive Development, 13(3), 369-386.
- Troscianko T, Calvert J. 1993. Impaired parallel visual search mechanisms in Parkinson's disease: implications for the role of dopamine in visual attention. Clinical Vision Science, 8:281–7.
- Valentine T, & Bruce V. (1986). The effects of distinctiveness in recognizing and classifying faces. Perception Vol 15, pp. 525 – 533.
- Vuilleumier P. (2000). Faces call for attention: evidence from patients with visual extinction. Neuropsychologia Vol 38, pp. 693 – 700.
- Weinstein A, Troscianko T, Calvert J. 1997. Impaired visual search mechanisms in Parkinson's disease (PD): a psychophysical and event-related potentials study. Journal of Psychophysiology, 11: 33–47.
- Wolfe, J.M. 1994. Guided Search 2.0: A revised model of visual search. Psychonomic Bulletin and Review, 1(2): 202-238.
- Keehn, B., Brenner, L., Palmer, E., Lincoln, A.J., & Müller, R.-A. (2008). Functional brain organization for visual search in autism spectrum disorder. Journal of the International Neuropsychological Society, 14, 990–1003.
- O'Riordan, M.A., Plaisted, K.C., Driver, J., & Baron-Cohen, S. (2001). Superior visual search in autism. Journal of Experimental Psychology: Human Perception and Performance, 27(3), 719-730.
- Swettenham, J., Milne, E., Cambell, R., & Plaisted, K. (2000). Visuospatial orienting in response to social stimuli. Journal of Cognitive Neuroscience, 55(Suppl. D), 96-97.
- Plaisted, K., O'Riordan, M., & Baron-Cohen, S. (1998a). Enhanced discrimination of novel, highly similar stimuli by adults with autism during a perceptual learning task. Journal of Child Psychology and Psychiatry, 39, 765-775.
External links
- http://psytoolkit.leeds.ac.uk/lessons/visualsearch.html Demonstration and lesson about visual search using PsyToolkit (software)