921 results for Pattern Recognition, Visual
Abstract:
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that takes the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed for the generation of a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.
Abstract:
Neuronal oscillations are an important aspect of EEG recordings. These oscillations are supposed to be involved in several cognitive mechanisms. For instance, oscillatory activity is considered a key component for the top-down control of perception. However, measuring this activity and its influence requires precise extraction of frequency components. This processing is not straightforward. Particularly, difficulties with extracting oscillations arise due to their time-varying characteristics. Moreover, when phase information is needed, it is of the utmost importance to extract narrow-band signals. This paper presents a novel method using adaptive filters for tracking and extracting these time-varying oscillations. This scheme is designed to maximize the oscillatory behavior at the output of the adaptive filter. It is then capable of tracking an oscillation and describing its temporal evolution even during low amplitude time segments. Moreover, this method can be extended in order to track several oscillations simultaneously and to use multiple signals. These two extensions are particularly relevant in the framework of EEG data processing, where oscillations are active at the same time in different frequency bands and signals are recorded with multiple sensors. The presented tracking scheme is first tested with synthetic signals in order to highlight its capabilities. Then it is applied to data recorded during a visual shape discrimination experiment for assessing its usefulness during EEG processing and in detecting functionally relevant changes. This method is an interesting additional processing step for providing alternative information compared to classical time-frequency analyses and for improving the detection and analysis of cross-frequency couplings.
Abstract:
The processing of human bodies is important in social life and for the recognition of another person's actions, moods, and intentions. Recent neuroimaging studies on mental imagery of human body parts suggest that the left hemisphere is dominant in body processing. However, studies on mental imagery of full human bodies reported stronger right hemisphere or bilateral activations. Here, we measured functional magnetic resonance imaging during mental imagery of bilateral partial (upper) and full bodies. Results show that, independently of whether a full or upper body is processed, the right hemisphere (temporo-parietal cortex, anterior parietal cortex, premotor cortex, bilateral superior parietal cortex) is mainly involved in mental imagery of full or partial human bodies. However, distinct activations were found in extrastriate cortex for partial bodies (right fusiform face area) and full bodies (left extrastriate body area). We propose that a common brain network, mainly on the right side, is involved in the mental imagery of human bodies, while two distinct brain areas in extrastriate cortex code for mental imagery of full and upper bodies.
Abstract:
We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with a top-down control, which uses the previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This classification provides the core of each object, which is used to acquire a more accurate object model. Growing these cores by specific active regions then yields an accurate recognition of known regions. Next, a general segmentation stage segments the unknown regions using a bottom-up strategy. Finally, the last stage attempts a region fusion of known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Experimental results are shown and evaluated to support the validity of our proposal.
Abstract:
Voluntary control of information processing is crucial to allocate resources and prioritize the processes that are most important under a given situation; the algorithms underlying such control, however, are often not clear. We investigated possible algorithms of control for the performance of the majority function, in which participants searched for and identified one of two alternative categories (left or right pointing arrows) as composing the majority in each stimulus set. We manipulated the amount (set size of 1, 3, and 5) and content (ratio of left and right pointing arrows within a set) of the inputs to test competing hypotheses regarding mental operations for information processing. Using a novel measure based on computational load, we found that reaction time was best predicted by a grouping search algorithm as compared to alternative algorithms (i.e., exhaustive or self-terminating search). The grouping search algorithm involves sampling and resampling of the inputs before a decision is reached. These findings highlight the importance of investigating the implications of voluntary control via algorithms of mental operations.
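To make the contrast between the two baseline algorithms named above concrete, here is a minimal Python sketch of exhaustive and self-terminating search on the majority function. It is an illustration only, not the study's computational-load measure: the function names, the "L"/"R" encoding of arrows, and the use of "number of items inspected" as a stand-in for load are all assumptions. The grouping search algorithm is not sketched, because the abstract does not specify its sampling procedure.

```python
from collections import Counter


def exhaustive_search(arrows):
    """Inspect every arrow, then report the majority direction.

    Returns (majority, items_inspected). Hypothetical helper for illustration.
    """
    counts = Counter(arrows)
    majority = counts.most_common(1)[0][0]
    return majority, len(arrows)


def self_terminating_search(arrows):
    """Inspect arrows in order and stop once one direction has a strict majority.

    Returns (majority, items_inspected). Hypothetical helper for illustration.
    """
    needed = len(arrows) // 2 + 1  # smallest count that guarantees a majority
    counts = Counter()
    for inspected, arrow in enumerate(arrows, start=1):
        counts[arrow] += 1
        if counts[arrow] == needed:
            return arrow, inspected
    raise ValueError("no strict majority; use an odd set size")


if __name__ == "__main__":
    stimulus = ["L", "L", "R", "L", "R"]  # set size 5, ratio 3:2
    print(exhaustive_search(stimulus))
    print(self_terminating_search(stimulus))
```

Under these assumptions the two strategies differ only in how many items they inspect, which depends on both set size and the left/right ratio, the two factors the study manipulated.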
Abstract:
BACKGROUND: A key aspect of representations for object recognition and scene analysis in the ventral visual stream is the spatial frame of reference, be it a viewer-centered, object-centered, or scene-based coordinate system. Coordinate transforms from retinocentric space to other reference frames involve combining neural visual responses with extraretinal postural information. METHODOLOGY/PRINCIPAL FINDINGS: We examined whether such spatial information is available to anterior inferotemporal (AIT) neurons in the macaque monkey by measuring the effect of eye position on responses to a set of simple 2D shapes. We report, for the first time, a significant eye position effect in over 40% of recorded neurons with small gaze angle shifts from central fixation. Although eye position modulates responses, it does not change shape selectivity. CONCLUSIONS/SIGNIFICANCE: These data demonstrate that spatial information is available in AIT for the representation of objects and scenes within a non-retinocentric frame of reference. More generally, the availability of spatial information in AIT calls into question the classic dichotomy in visual processing that associates object shape processing with ventral structures such as AIT but places spatial processing in a separate anatomical stream projecting to dorsal structures.
Abstract:
We seek to determine the relationship between threshold and suprathreshold perception for position offset and stereoscopic depth perception under conditions that elevate their respective thresholds. Two threshold-elevating conditions were used: (1) increasing the interline gap and (2) dioptric blur. Although increasing the interline gap increases position (Vernier) offset and stereoscopic disparity thresholds substantially, the perception of suprathreshold position offset and stereoscopic depth remains unchanged. Perception of suprathreshold position offset also remains unchanged when the Vernier threshold is elevated by dioptric blur. We show that such normalization of suprathreshold position offset can be attributed to the topographical-map-based encoding of position. On the other hand, dioptric blur increases the stereoscopic disparity thresholds and reduces the perceived suprathreshold stereoscopic depth, which can be accounted for by a disparity-computation model in which the activities of absolute disparity encoders are multiplied by a Gaussian weighting function that is centered on the horopter. Overall, the statement "equal suprathreshold perception occurs in threshold-elevated and unelevated conditions when the stimuli are equally above their corresponding thresholds" describes the results better than the statement "suprathreshold stimuli are perceived as equal when they are equal multiples of their respective threshold values."
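The disparity-computation model mentioned above multiplies encoder activities by a Gaussian weighting function centered on the horopter. A minimal Python sketch of that weighting step follows. It assumes the horopter corresponds to zero disparity and that each encoder has a single preferred disparity; the function names, the `sigma` parameter, and the units are hypothetical, and the abstract does not give the model's actual parameters or its later stages.

```python
import math


def gaussian_weight(disparity, sigma):
    """Gaussian weight centered on the horopter (assumed here to be zero disparity)."""
    return math.exp(-(disparity ** 2) / (2.0 * sigma ** 2))


def weighted_activities(activities, preferred_disparities, sigma):
    """Multiply each encoder's activity by the weight at its preferred disparity.

    `activities` and `preferred_disparities` are parallel lists; `sigma` is a
    hypothetical width parameter in the same units as the disparities.
    """
    return [
        activity * gaussian_weight(disparity, sigma)
        for activity, disparity in zip(activities, preferred_disparities)
    ]


if __name__ == "__main__":
    # Three encoders with equal raw activity, tuned to -1, 0, and +1 (arbitrary units).
    print(weighted_activities([1.0, 1.0, 1.0], [-1.0, 0.0, 1.0], sigma=1.0))
```

In this sketch, encoders tuned away from the horopter contribute less after weighting, and the weighting is symmetric for crossed and uncrossed disparities of equal size.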
Abstract:
It is widely accepted that infants begin learning their native language not by learning words, but by discovering features of the speech signal: consonants, vowels, and combinations of these sounds. Learning to understand words, as opposed to just perceiving their sounds, is said to come later, between 9 and 15 mo of age, when infants develop a capacity for interpreting others' goals and intentions. Here, we demonstrate that this consensus about the developmental sequence of human language learning is flawed: in fact, infants already know the meanings of several common words from the age of 6 mo onward. We presented 6- to 9-mo-old infants with sets of pictures to view while their parent named a picture in each set. Over this entire age range, infants directed their gaze to the named pictures, indicating their understanding of spoken words. Because the words were not trained in the laboratory, the results show that even young infants learn ordinary words through daily experience with language. This surprising accomplishment indicates that, contrary to prevailing beliefs, either infants can already grasp the referential intentions of adults at 6 mo or infants can learn words before this ability emerges. The precocious discovery of word meanings suggests a perspective in which learning vocabulary and learning the sound structure of spoken language go hand in hand as language acquisition begins.
Abstract:
The present study investigates human visual processing of simple two-colour patterns using a delayed match to sample paradigm with positron emission tomography (PET). This study is unique in that we specifically designed the visual stimuli to be the same for both pattern and colour recognition with all patterns being abstract shapes not easily verbally coded composed of two-colour combinations. We did this to explore those brain regions required for both colour and pattern processing and to separate those areas of activation required for one or the other. We found that both tasks activated similar occipital regions, the major difference being more extensive activation in pattern recognition. A right-sided network that involved the inferior parietal lobule, the head of the caudate nucleus, and the pulvinar nucleus of the thalamus was common to both paradigms. Pattern recognition also activated the left temporal pole and right lateral orbital gyrus, whereas colour recognition activated the left fusiform gyrus and several right frontal regions. (C) 2001 Wiley-Liss, Inc.
Abstract:
The robotics community is concerned with the ability to infer and compare the results from researchers in areas such as vision perception and multi-robot cooperative behavior. To accomplish that task, this paper proposes a real-time indoor visual ground truth system capable of providing accuracy at least an order of magnitude higher than the precision of the algorithm to be evaluated. A multi-camera architecture is proposed under the ROS (Robot Operating System) framework to estimate the 3D position of objects, and the implementation and results are contextualized to the RoboCup Middle Size League scenario.
Abstract:
Positioning a robot with respect to objects by using data provided by a camera is a well-known technique called visual servoing. In order to perform a task, the object must exhibit visual features which can be extracted from different points of view. Visual servoing is therefore object-dependent, as it depends on the object's appearance. Consequently, the positioning task cannot be performed in the presence of nontextured objects or objects for which extracting visual features is too complex or too costly. This paper proposes a solution to tackle this limitation inherent to current visual servoing techniques. Our proposal is based on the coded structured light approach as a reliable and fast way to solve the correspondence problem. In this case, a coded light pattern is projected, providing robust visual features independently of the object's appearance.