750 results for Object vision


Relevance: 60.00%

Abstract:

Working memory is the process of actively maintaining a representation of information for a brief period of time so that it is available for use. In monkeys, visual working memory involves the concerted activity of a distributed neural system, including posterior areas in visual cortex and anterior areas in prefrontal cortex. Within visual cortex, ventral stream areas are selectively involved in object vision, whereas dorsal stream areas are selectively involved in spatial vision. This domain specificity appears to extend forward into prefrontal cortex, with ventrolateral areas involved mainly in working memory for objects and dorsolateral areas involved mainly in working memory for spatial locations. The organization of this distributed neural system for working memory in monkeys appears to be conserved in humans, though some differences between the two species exist. In humans, as compared with monkeys, areas specialized for object vision in the ventral stream have a more inferior location in temporal cortex, whereas areas specialized for spatial vision in the dorsal stream have a more superior location in parietal cortex. Displacement of both sets of visual areas away from the posterior perisylvian cortex may be related to the emergence of language over the course of brain evolution. Whereas areas specialized for object working memory in humans and monkeys are similarly located in ventrolateral prefrontal cortex, those specialized for spatial working memory occupy a more superior and posterior location within dorsal prefrontal cortex in humans than in monkeys. As in posterior cortex, this displacement in frontal cortex also may be related to the emergence of new areas to serve distinctively human cognitive abilities.

Relevance: 60.00%

Abstract:

Considerable evidence exists to support the hypothesis that the hippocampus and related medial temporal lobe structures are crucial for the encoding and storage of information in long-term memory. Few human imaging studies, however, have successfully shown signal intensity changes in these areas during encoding or retrieval. Using functional magnetic resonance imaging (fMRI), we studied normal human subjects while they performed a novel picture encoding task. High-speed echo-planar imaging techniques evaluated fMRI signal changes throughout the brain. During the encoding of novel pictures, statistically significant increases in fMRI signal were observed bilaterally in the posterior hippocampal formation and parahippocampal gyrus and in the lingual and fusiform gyri. To our knowledge, this experiment is the first fMRI study to show robust signal changes in the human hippocampal region. It also provides evidence that the encoding of novel, complex pictures depends upon an interaction between ventral cortical regions, specialized for object vision, and the hippocampal formation and parahippocampal gyrus, specialized for long-term memory.

Relevance: 60.00%

Abstract:

Disruptive colouration is a visual camouflage composed of false edges and boundaries. Many disruptively camouflaged animals feature enhanced edges: light patches are surrounded by a lighter outline and/or dark patches are surrounded by a darker outline. This camouflage is particularly common in amphibians, reptiles and lepidopterans. We explored the role that this pattern has in creating effective camouflage. In a visual search task utilising an ultra-large display area, mimicking search tasks that might be found in nature, edge-enhanced disruptive camouflage increases crypsis, even on substrates that do not provide an obvious visual match. Specifically, edge-enhanced camouflage is effective on backgrounds both with and without shadows; i.e. its effect is not solely due to background matching of the dark edge-enhancement element with the shadows. Furthermore, when the dark component of the edge enhancement was omitted, the camouflage still provided better crypsis than control patterns without edge enhancement. This kind of edge enhancement improved camouflage on all background types. Lastly, we show that edge enhancement can create a perception of multiple surfaces. We conclude that edge enhancement increases the effectiveness of disruptive camouflage through mechanisms that may include improved disruption of the object outline by implying pictorial relief.
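The outline principle this abstract describes (light patches ringed by an even lighter outline, dark patches by an even darker one) can be sketched in a few lines of Python; the patch size and intensity values below are arbitrary illustrative choices, not the stimuli used in the study.

```python
import numpy as np

def edge_enhanced_patch(size=21, light=True):
    """Build a square camouflage patch whose outline exaggerates its own
    polarity: light patches get a lighter rim, dark patches a darker one.
    Intensities are in [0, 1] and chosen arbitrarily for illustration."""
    interior, rim = (0.7, 0.9) if light else (0.3, 0.1)
    patch = np.full((size, size), interior)
    # Paint the one-pixel outline with the exaggerated intensity.
    patch[0, :] = patch[-1, :] = patch[:, 0] = patch[:, -1] = rim
    return patch

light_patch = edge_enhanced_patch(light=True)
dark_patch = edge_enhanced_patch(light=False)
print(light_patch[0, 0] > light_patch[10, 10])  # True: rim lighter than interior
print(dark_patch[0, 0] < dark_patch[10, 10])    # True: rim darker than interior
```

Tiling such patches over a background yields the false internal boundaries that disruptive colouration exploits.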

Relevance: 40.00%

Abstract:

The integration of CMOS cameras with embedded processors and wireless communication devices has enabled the development of distributed wireless vision systems. Wireless Vision Sensor Networks (WVSNs), which consist of wirelessly connected embedded systems with vision and sensing capabilities, enable a wide variety of applications that have not been possible to realize with wall-powered vision systems using wired links or with scalar-data-based wireless sensor networks. In this paper, the design of a middleware for a wireless vision sensor node is presented for the realization of WVSNs. The implemented wireless vision sensor node is tested through a simple vision application in order to study and analyze its capabilities and to determine the challenges of distributed vision applications running over a wireless network of low-power embedded devices. The results highlight the practical concerns in developing efficient image processing and communication solutions for WVSNs and emphasize the need for cross-layer solutions that unify these two so-far-independent research areas.

Relevance: 40.00%

Abstract:

Thesis in English. Blank pages removed from the PDF.

Relevance: 40.00%

Abstract:

In recent years, deep learning techniques have been shown to perform well on a large variety of problems in both Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition, pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite its strong success in both science and business, deep learning has its limitations. It is often questioned whether such techniques are merely brute-force statistical approaches that can only work in the context of High Performance Computing with enormous amounts of data. Another important question is whether they are really biologically inspired, as is claimed in certain cases, and whether they can scale well in terms of "intelligence". This dissertation tries to answer these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been heavily revolutionized by recent advances in the field. Practically speaking, the answers are based on an exhaustive comparison between two very different deep learning techniques on the aforementioned task: the Convolutional Neural Network (CNN) and Hierarchical Temporal Memory (HTM). They represent two different approaches and points of view under the broad umbrella of deep learning and are well suited to understanding and pointing out the strengths and weaknesses of each. The CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are well received and accepted by the scientific community and are already deployed in large corporations such as Google and Facebook to solve face recognition and image auto-tagging problems.
HTM, on the other hand, is an emerging paradigm: a mainly unsupervised method that is more biologically inspired. It tries to gain insights from the computational neuroscience community in order to incorporate concepts such as time, context and attention during the learning process, which are typical of the human brain. Finally, the thesis aims to show that in certain cases, with smaller quantities of data, HTM can outperform the CNN.
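As a rough illustration of the kind of local feature extraction a CNN layer performs (a sketch only; the networks compared in the dissertation are of course far larger and learn their kernels), a single convolution-plus-ReLU step can be written directly in NumPy with a hand-made edge-detection kernel:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation followed by ReLU: the core
    operation a CNN layer applies to extract local features."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return np.maximum(out, 0)  # ReLU non-linearity

# A Sobel-style vertical-edge kernel responding to a light/dark boundary.
image = np.zeros((5, 5))
image[:, 3:] = 1.0                      # right half of the image is bright
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
feature_map = conv2d(image, sobel_x)
print(feature_map.shape)  # (3, 3): the edge columns respond, the flat one does not
```

In a trained CNN many such kernels are learned from data and stacked with pooling into a deep hierarchy; this sketch only shows the elementary operation.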

Relevance: 40.00%

Abstract:

In sports games, it is often necessary to perceive a large number of moving objects (e.g., the ball and players). In this context, the role of peripheral vision for processing motion information in the periphery is often discussed, especially when motor responses are required. In an attempt to test the basic functionality of peripheral vision in such sports-game situations, a Multiple Object Tracking (MOT) task, which requires tracking a certain number of targets amidst distractors, was chosen. Participants' primary task was to recall four targets (out of 10 rectangular stimuli) after six seconds of quasi-random motion. As a secondary task, a button had to be pressed if a target change occurred (Exp 1: stop vs. form change to a diamond for 0.5 s; Exp 2: stop vs. slowdown for 0.5 s). While the eccentricities of changes (5-10° vs. 15-20°) were manipulated, decision accuracy (recall and button press correct), motor response time and saccadic reaction time were calculated as dependent variables. Results show that participants indeed used peripheral vision to detect changes, because either no saccades or only very late saccades to the changed target were executed in correct trials. Moreover, saccades were executed more often when eccentricities were small. Response accuracies were higher and response times were lower in the stop conditions of both experiments, while larger eccentricities led to higher response times in all conditions. In summary, it could be shown that monitoring targets and detecting changes can be accomplished by peripheral vision alone, and that a monitoring strategy based on peripheral vision may be optimal, as saccades may incur certain costs. Further research is planned to address the question of whether this functionality is also evident in sports tasks.
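The eccentricity conditions (5-10° vs. 15-20°) are expressed in degrees of visual angle. The standard conversion between an on-screen extent and visual angle is 2·atan(s/2d); the 60 cm viewing distance used below is a hypothetical value for illustration, not one reported in the abstract:

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an extent `size_cm` viewed from
    `distance_cm`, via the standard 2 * atan(s / 2d) relation."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

def eccentricity_cm(angle_deg, distance_cm):
    """Inverse: on-screen distance from fixation corresponding to a
    given eccentricity in degrees of visual angle."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

d = 60.0  # hypothetical viewing distance in cm
print(round(eccentricity_cm(10, d), 1))  # screen offset for a 10 deg change
# Round-trip sanity check: converting back recovers the angle.
print(round(visual_angle_deg(eccentricity_cm(20, d), d), 1))  # 20.0
```

The relation is noticeably non-linear only at large angles, which is why the 15-20° condition needs a large display such as the one used in these experiments.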

Relevance: 40.00%

Abstract:

In sports games, it is often necessary to perceive a large number of moving objects (e.g., the ball and players). In this context, the role of peripheral vision for processing motion information in the periphery is often discussed, especially when motor responses are required. In an attempt to test the capability of using peripheral vision in such sports-game situations, a Multiple Object Tracking task, which requires tracking a certain number of targets amidst distractors, was chosen to determine the sensitivity of detecting target changes with peripheral vision only. Participants' primary task was to recall four targets (out of 10 rectangular stimuli) after six seconds of quasi-random motion. As a secondary task, a button had to be pressed if a target change occurred (Exp 1: stop vs. form change to a diamond for 0.5 s; Exp 2: stop vs. slowdown for 0.5 s). The eccentricities of changes (5-10° vs. 15-20°) were manipulated; decision accuracy (recall and button press correct), motor response time and saccadic reaction time (change onset to saccade onset) were calculated; and eye movements were recorded. Results show that participants indeed used peripheral vision to detect changes, because either no saccades or only very late saccades to the changed target were executed in correct trials. Moreover, saccades were executed more often when eccentricities were small. Response accuracies were higher and response times were lower in the stop conditions of both experiments, while larger eccentricities led to higher response times in all conditions. In summary, it could be shown that monitoring targets and detecting changes can be accomplished by peripheral vision alone, and that a monitoring strategy based on peripheral vision may be optimal, as saccades may incur certain costs. Further research is planned to address the question of whether this functionality is also evident in sports tasks.

Relevance: 40.00%

Abstract:

In the current study, it is investigated whether peripheral vision can be used to monitor multiple moving objects and to detect single-target changes. For this purpose, in Experiment 1, a modified MOT setup with a large projection and a constant-position centroid phase had to be checked first. Classical findings regarding the use of a virtual centroid to track multiple objects and the dependency of tracking accuracy on target speed could be successfully replicated. Thereafter, the main experimental variations regarding the manipulation of to-be-detected target changes could be introduced in Experiment 2. In addition to a button press used for the detection task, gaze behavior was assessed using an integrated eye-tracking system. The analysis of saccadic reaction times in relation to the motor response shows that peripheral vision is naturally used to detect motion and form changes in MOT, because the saccade to the target occurred after target-change offset. Furthermore, for changes of comparable task difficulty, motion changes are detected better by peripheral vision than form changes. Findings indicate that capabilities of the visual system (e.g., visual acuity) affect change detection rates and that covert-attention processes may be affected by vision-related aspects like spatial uncertainty. Moreover, it is argued that a centroid-MOT strategy might reduce the amount of saccade-related costs and that eye-tracking seems generally valuable for testing predictions derived from theories on MOT. Finally, implications for testing covert attention in applied settings are proposed.

Relevance: 40.00%

Abstract:

The perception of an object as a single entity within a visual scene requires that its features are bound together and segregated from the background and/or other objects. Here, we used magnetoencephalography (MEG) to assess the hypothesis that coherent percepts may arise from the synchronized high frequency (gamma) activity between neurons that code features of the same object. We also assessed the role of low frequency (alpha, beta) activity in object processing. The target stimulus (i.e. object) was a small patch of a concentric grating of 3c/°, viewed eccentrically. The background stimulus was either a blank field or a concentric grating of 3c/° periodicity, viewed centrally. With patterned backgrounds, the target stimulus emerged--through rotation about its own centre--as a circular subsection of the background. Data were acquired using a 275-channel whole-head MEG system and analyzed using Synthetic Aperture Magnetometry (SAM), which allows one to generate images of task-related cortical oscillatory power changes within specific frequency bands. Significant oscillatory activity across a broad range of frequencies was evident at the V1/V2 border, and subsequent analyses were based on a virtual electrode at this location. When the target was presented in isolation, we observed that: (i) contralateral stimulation yielded a sustained power increase in gamma activity; and (ii) both contra- and ipsilateral stimulation yielded near identical transient power changes in alpha (and beta) activity. When the target was presented against a patterned background, we observed that: (i) contralateral stimulation yielded an increase in high-gamma (>55 Hz) power together with a decrease in low-gamma (40-55 Hz) power; and (ii) both contra- and ipsilateral stimulation yielded a transient decrease in alpha (and beta) activity, though the reduction tended to be greatest for contralateral stimulation. 
The opposing power changes across different regions of the gamma spectrum under 'figure/ground' stimulation suggest a possible dual role for gamma rhythms in visual object coding, and provide general support for the binding-by-synchronization hypothesis. As the power changes in alpha and beta activity were largely independent of the spatial location of the target, however, we conclude that their role in object processing may relate principally to changes in visual attention.
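The band-limited power changes that analyses such as SAM quantify can be illustrated with a minimal periodogram sketch; the sampling rate and the synthetic two-component trace below are invented for illustration and are unrelated to the recorded MEG data:

```python
import numpy as np

def band_power(signal, fs, f_lo, f_hi):
    """Mean FFT-periodogram power within [f_lo, f_hi] Hz -- a toy
    stand-in for the band-limited power a beamformer analysis images."""
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / signal.size
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[mask].mean()

fs = 600                        # Hz, hypothetical sampling rate
t = np.arange(0, 2, 1 / fs)     # 2 s of data
# Synthetic "virtual electrode" trace: strong 10 Hz alpha, weak 70 Hz gamma.
trace = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.3 * np.sin(2 * np.pi * 70 * t)
alpha = band_power(trace, fs, 8, 13)
high_gamma = band_power(trace, fs, 55, 100)
print(alpha > high_gamma)  # True: the alpha component dominates this trace
```

Real SAM contrasts compare such band power between active and baseline windows at every voxel; this sketch only shows the band-power computation itself.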

Relevance: 30.00%

Abstract:

Visual pigments, the molecules in photoreceptors that initiate the process of vision, are inherently dichroic, differentially absorbing light according to its axis of polarization. Many animals have taken advantage of this property to build receptor systems capable of analyzing the polarization of incoming light, as polarized light is abundant in natural scenes (commonly being produced by scattering or reflection). Such polarization sensitivity has long been associated with behavioral tasks like orientation or navigation. However, only recently have we become aware that it can be incorporated into a high-level visual perception akin to color vision, permitting segmentation of a viewed scene into regions that differ in their polarization. By analogy to color vision, we call this capacity polarization vision. It is apparently used for tasks like those that color vision specializes in: contrast enhancement, camouflage breaking, object recognition, and signal detection and discrimination. While color is very useful in terrestrial or shallow-water environments, it is an unreliable cue deeper in water due to the spectral modification of light as it travels through water of various depths or of varying optical quality. Here, polarization vision has special utility and consequently has evolved in numerous marine species, as well as at least one terrestrial animal. In this review, we consider recent findings concerning polarization vision and its significance in biological signaling.
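The differential absorption underlying this polarization sensitivity follows, to first order, a Malus-type cos² dependence on the angle between the light's e-vector and the pigment's absorbing axis; the sketch below is this textbook idealization, not a model of any particular receptor:

```python
import math

def captured_fraction(angle_deg):
    """Malus-type cos^2 law: fraction of linearly polarized light captured
    by an ideal dichroic absorber as a function of the angle between the
    light's e-vector and the absorber's axis."""
    return math.cos(math.radians(angle_deg)) ** 2

# Maximal capture when aligned, half at 45 degrees, none when orthogonal:
print(captured_fraction(0))              # 1.0
print(round(captured_fraction(45), 2))   # 0.5
print(round(captured_fraction(90), 10))  # 0.0
```

Comparing outputs of receptors with differently oriented axes is what lets such a system recover the angle and degree of polarization of the incoming light.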

Relevance: 30.00%

Abstract:

Dissertation presented at the Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa to obtain the Master's degree in Electrical and Computer Engineering.

Relevance: 30.00%

Abstract:

Dissertation submitted to obtain the Master's degree in Informatics Engineering.

Relevance: 30.00%

Abstract:

Nowadays, 3D scanning cameras and microscopes on the market use digital or discrete sensors, such as CCDs or CMOS sensors, for object detection applications. However, these combined systems are not fast enough for some application scenarios, since they require large data-processing resources and can be cumbersome. There is therefore a clear interest in exploring the possibilities and performance of analogue sensors, such as arrays of position sensitive detectors (PSDs), with the final goal of integrating them into 3D scanning cameras or microscopes for object detection purposes. The work performed in this thesis deals with the implementation of prototype systems in order to explore object detection using amorphous silicon position sensors of 32 and 128 lines, which were produced in the clean room at CENIMAT-CEMOP. During the first phase of this work, the fabrication and the study of the static and dynamic specifications of the sensors, as well as their conditioning in relation to the existing scientific and technological knowledge, served as the starting point. Subsequently, suitable data acquisition and signal processing electronics were assembled. Various prototypes were developed for the 32- and 128-line PSD array sensors. Appropriate optical solutions were integrated to work together with the constructed prototypes, allowing the required experiments to be carried out and the results presented in this thesis to be achieved. All control, data acquisition and 3D rendering platform software was implemented for the existing systems. All these components were combined to form several integrated systems for the 32- and 128-line PSD 3D sensors. The performance of the 32-line PSD array sensor and system was evaluated for machine vision applications, such as 3D object rendering, as well as for microscopy applications, such as micro-object movement detection. Trials were also performed with the 128-line PSD array sensor systems.
Sensor channel non-linearities of approximately 4 to 7% were obtained. Overall, the results show the possibility of using a linear array of 32/128 1D line sensors based on amorphous silicon technology to render 3D profiles of objects. The system and setup presented allow 3D rendering at high speeds and high frame rates. The minimum detail or gap that can be detected by the sensor system is approximately 350 μm with the current setup. It is also possible to render an object in 3D within a scanning angle range of 15° to 85° and to identify its real height as a function of the scanning angle and the image displacement distance on the sensor. Both simple and more complex objects, such as a rubber and a plastic fork, can be rendered in 3D properly and accurately, also at high resolution, using this sensor and system platform. The nip-structure sensor system can detect primary and even derived colors of objects through proper adjustment of the system's integration time and by combining white, red, green and blue (RGB) light sources. A mean colorimetric error of 25.7 was obtained. It is also possible to detect the movement of micrometre-scale objects using the 32-line PSD sensor system. This kind of setup makes it possible to detect whether a micro-object is moving, what its dimensions are and what its position is in two dimensions, even at high speeds. Results show a non-linearity of about 3% and a spatial resolution of < 2 µm.
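The abstract's relation between object height, scanning angle and image displacement can be sketched for a standard sheet-of-light triangulation geometry. This is an assumed textbook geometry, not necessarily the exact optical arrangement of the thesis, and the sensor's optical magnification (which scales image displacement relative to object-space displacement) is ignored:

```python
import math

def object_height_mm(displacement_mm, scan_angle_deg):
    """Sheet-of-light triangulation under an assumed geometry: a surface
    raised by h shifts the projected light line laterally by
    d = h * tan(theta), so h = d / tan(theta), where theta is the angle
    between the illumination sheet and the surface normal."""
    return displacement_mm / math.tan(math.radians(scan_angle_deg))

# A 2 mm lateral line shift observed at a 45 deg scan angle -> ~2 mm height.
print(round(object_height_mm(2.0, 45.0), 6))
# Shallower angles trade height sensitivity for range, consistent with the
# usable 15-85 deg scanning range reported above.
print(object_height_mm(2.0, 15.0) > object_height_mm(2.0, 85.0))  # True
```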

Relevance: 30.00%

Abstract:

Report on the scientific sojourn at the Swiss Federal Institute of Technology Zurich, Switzerland, between September and December 2007. In order to make robots useful assistants in our everyday life, the ability to learn and recognize objects is of essential importance. However, object recognition in real scenes is one of the most challenging problems in computer vision, as many difficulties must be dealt with. Furthermore, in mobile robotics a new challenge is added to the list: computational complexity. In a dynamic world, information about the objects in the scene can become obsolete before it is ready to be used if the detection algorithm is not fast enough. Two recent object recognition techniques have achieved notable results: the constellation approach proposed by Lowe and the bag-of-words approach proposed by Nistér and Stewénius. The Lowe constellation approach is the one currently used in the robot localization work of the COGNIRON project. This report is divided into two main sections. The first section briefly reviews the currently used object recognition system, the Lowe approach, and brings to light the drawbacks found for object recognition in the context of indoor mobile robot navigation; the proposed improvements to the algorithm are also described. In the second section, the alternative bag-of-words method is reviewed, along with several experiments conducted to evaluate its performance on our own object databases. Furthermore, some modifications to the original algorithm to make it suitable for object detection in unsegmented images are proposed.
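A minimal sketch of the bag-of-visual-words idea attributed above to Nistér and Stewénius: local descriptors are quantized against a fixed vocabulary of "visual words", and an image is represented by its word histogram. The toy 2-D descriptors and three-word vocabulary below are invented for illustration; real systems use high-dimensional descriptors (e.g. 128-D SIFT) and vocabularies built by (hierarchical) k-means:

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Bag-of-visual-words encoding: assign each local descriptor to its
    nearest vocabulary word and count the assignments, yielding a
    fixed-length, normalized image signature."""
    # Pairwise squared distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()  # normalize so images of any size compare

# Toy vocabulary of 3 "words" in a 2-D descriptor space.
vocab = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
desc = np.array([[0.1, 0.2], [9.8, 0.3], [0.2, 9.9], [0.0, 0.1]])
print(bow_histogram(desc, vocab).tolist())  # [0.5, 0.25, 0.25]
```

Because the signature length depends only on the vocabulary size, two images can be compared by a single histogram distance, which is what makes the approach attractive for the fast retrieval needs of mobile robotics described above.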