939 results for visual object detection


Relevance:

30.00% 30.00%

Publisher:

Abstract:

One of the most challenging problems that any theoretical model purporting to explain the competence of the human brain for relational tasks must solve is the analysis and representation of the internal structure of an extended spatial layout of multiple objects. Some of these problems concern specific aims, such as how to extract and represent spatial relationships among objects, how to represent the movement of a selected object, and so on. The main objective of this paper is to study some plausible brain structures that can provide answers to these problems. Moreover, in order to achieve more concrete knowledge, our study focuses on the response of the retinal layers for optical information processing and on how this information can be processed in the first cortex layers. The model reported is just a first trial, and some major additions are needed to complete the whole vision process.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

As is well known, there are five types of neurons in the mammalian retinal layer that allow the detection of several important characteristics of the visual image impinging on the visual system: photoreceptors, horizontal cells, amacrine cells, bipolar cells, and ganglion cells. It is also well known that the amacrine neuron architecture allows a first detection of object motion, making it the most important retinal cell for this function. We have already studied and simulated the Dowling retina model and have verified that many complex processes in visual detection are performed on the basis of the amacrine cell synaptic connections. This work shows how this structure may be employed for motion detection.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Mono-camera tracking systems have proved their capabilities for moving object trajectory analysis and scene monitoring, but their robustness and their potential for semantic scene understanding are strongly limited by their local and monocular nature, which makes them insufficient for realistic surveillance applications. This thesis aims to extend the possibilities of moving object tracking systems to achieve greater robustness and a higher level of scene understanding. The proposed extension comprises two separate directions. The first one is local, since it is aimed at enriching the inferred positions of the moving objects within the area of the monitored scene directly covered by the cameras of the system; this task is achieved through the development of a multi-camera system for robust 3D tracking, able to provide 3D tracking information of multiple simultaneous moving objects from the observations reported by a set of calibrated cameras with semi-overlapping fields of view. The second extension is global, as it is aimed at providing local observations performed within the field of view of one camera with a global context relating them to a much larger scene; to this end, an automatic camera positioning system relying only on observed object trajectories and a schematic scene map is designed.
The two lines of research in this thesis are addressed using Bayesian estimation as a general unifying framework. Its suitability for these two applications is justified by the flexibility and versatility of that statistical framework, which allows the combination of multiple sources of information about the parameters to estimate in a natural way, while rigorously addressing the uncertainty associated with those sources through the inclusion of specifically designed observation models. In addition, it opens multiple possibilities for the creation of different numerical methods adapted to the needs and characteristics of each addressed application. The proposed multi-camera 3D tracking method is specifically designed to work on schematic descriptions of the observations performed by each camera of the system: this choice does not assume any specific 2D detection or tracking algorithm at any sensor, allows the use of off-the-shelf 2D detection and/or tracking subsystems running independently at each sensor, and makes the proposal suitable for real surveillance networks with limited computational and transmission capabilities. The robust combination of such noisy, incomplete and possibly unreliable schematic descriptors, probably contaminated by false detections, relies on a Bayesian association method, based on geometry and color, whose results allow the 3D tracking of the targets in the scene with a particle filter. The main features exhibited by the proposal are, first, a remarkable accuracy in terms of target 3D positioning, and second, a great recovery ability after tracking losses due to insufficient input data.
The proposed system for visual-based camera self-positioning uses the observations of multiple moving objects and a schematic map of the passable areas of the environment to infer the absolute sensor position. To this end, a new Bayesian framework combining trajectory observations and map-induced dynamic models for moving objects is designed, which represents an approach to camera positioning never addressed before in the literature. This task is divided into two different sub-tasks, since each requires the design of specific sampling algorithms to correctly exploit the features of the developed framework: ambiguity analysis of the setting and approximate position estimation, on the one hand, and position refining, on the other. This system, designed for camera positioning and demonstrated in urban traffic environments, could also be applied to different environments and sensors of other modalities after certain adaptations.
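As a rough illustration of the particle-filter idea mentioned in this abstract, the following is a minimal sketch of a generic bootstrap particle filter for a one-dimensional position. It is not the thesis's 3D multi-camera method: the random-walk motion model, the Gaussian observation likelihood, and all names and parameters (`particle_filter_step`, `motion_std`, `obs_std`) are assumptions made only for this example.

```python
import math
import random


def particle_filter_step(particles, observation, motion_std, obs_std, rng):
    """One predict/update/resample cycle of a bootstrap particle filter (1D toy)."""
    # Predict: diffuse each particle with an assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Update: weight each particle by an assumed Gaussian observation likelihood.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Point estimate of the state: the mean of the particle set."""
    return sum(particles) / len(particles)


if __name__ == "__main__":
    rng = random.Random(0)
    # Start with no knowledge of the position: particles spread over [-10, 10].
    particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
    for _ in range(20):
        particles = particle_filter_step(particles, 3.0, 0.2, 0.5, rng)
    print(estimate(particles))
```

Because the particle set is redrawn from the observation-weighted prediction at every step, a filter of this kind can concentrate again on the target once informative observations return, which is the general mechanism behind recovery after tracking losses.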

Relevance:

30.00% 30.00%

Publisher:

Abstract:

RoboMind is an educational programming environment used at all educational levels, from primary school to university. This application simulates a robot that moves around a map. This project arises from the need to extend certain functionalities of the program. It was developed with the technologies provided by Java, using the freely distributed source code as a base. The project comprises a design part and an implementation part, in which object-oriented methodologies were used.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A more natural, intuitive, user-friendly, and less intrusive Human–Computer interface for controlling an application by executing hand gestures is presented. For this purpose, a robust vision-based hand-gesture recognition system has been developed, and a new database has been created to test it. The system is divided into three stages: detection, tracking, and recognition. The detection stage searches every frame of a video sequence for potential hand poses using a binary Support Vector Machine classifier and Local Binary Patterns as feature vectors. These detections are used as input to a tracker that generates a spatio-temporal trajectory of hand poses. Finally, the recognition stage segments a spatio-temporal volume of data using the obtained trajectories and computes a video descriptor called Volumetric Spatiograms of Local Binary Patterns (VS-LBP), which is delivered to a bank of SVM classifiers to perform the gesture recognition. The VS-LBP is a novel video descriptor that constitutes one of the most important contributions of the paper; it provides much richer spatio-temporal information than other existing state-of-the-art approaches at a manageable computational cost. Excellent results have been obtained, outperforming other state-of-the-art approaches.
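To make the Local Binary Patterns feature named in this abstract more concrete, here is a minimal sketch of the basic 8-neighbour LBP operator and a histogram of its codes. It is an illustration of plain LBP only, not the paper's VS-LBP descriptor or its detector; the neighbour ordering and the function names (`lbp_code`, `lbp_histogram`) are assumptions of this example.

```python
def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col).

    `image` is a list of rows of grey levels. Each neighbour contributes
    one bit, set when the neighbour is at least as bright as the centre.
    """
    center = image[row][col]
    # Neighbours visited clockwise starting from the top-left pixel
    # (an assumed convention; other orderings are equally valid).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


if __name__ == "__main__":
    patch = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    print(lbp_code(patch, 1, 1))  # 120
```

A histogram such as the one returned by `lbp_histogram` is the kind of fixed-length feature vector that could be passed to a binary classifier, for example an SVM, to decide whether an image window contains a hand pose.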

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper is part of a set of publications related to the development of mathematical models aimed at simulating the dynamic input and output of experimental nondestructive tests in order to detect structural imperfections. The structures considered are composed of thin steel plates. The imperfections in these cases are cracks, which can either penetrate a significant part of the plate thickness or be micro cracks or superficial imperfections. The first class of cracks is related to structural safety, and the second is more connected to the protection of the structure from the environment, particularly if protective paint can deteriorate. Two groups of mathematical models have been developed. The first group tries to locate the position and extent of imperfections of the first class, i.e. cracks, and it is the object of the present paper. Kirchhoff thin-plate bending models belong to this first group and are used for this purpose. The other group of models deals with membrane structures under excitation by superficial Rayleigh waves. This group of models is intended for micro crack detection. In the application of the first group of models to the detection of cracks, it has been observed that the differences between the natural frequencies of the non-cracked and the cracked structures are very small. However, geometry and crack position can be identified quite accurately if the comparison is carried out between the first derivatives (mode rotations) of the natural modes instead. Finally, regarding the analysis of superficial cracks, the use of Rayleigh waves is very promising: the geometry and penetration of the micro crack can be detected very accurately. The mathematical and numerical treatment of the generation of these Rayleigh waves is presented, and a numerical application is shown.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Coincidence detection is important for functions as diverse as Hebbian learning, binaural localization, and visual attention. We show here that extremely precise coincidence detection is a natural consequence of the normal function of rectifying electrical synapses. Such synapses open to bidirectional current flow when presynaptic cells depolarize relative to their postsynaptic targets and remain open until well after completion of presynaptic spikes. When multiple input neurons fire simultaneously, the synaptic currents sum effectively and produce a large excitatory postsynaptic potential. However, when some inputs are delayed relative to the rest, their contributions are reduced because the early excitatory postsynaptic potential retards the opening of additional voltage-sensitive synapses, and the late synaptic currents are shunted by already opened junctions. These mechanisms account for the ability of the lateral giant neurons of crayfish to sum synchronous inputs, but not inputs separated by only 100 μsec. This coincidence detection enables crayfish to produce reflex escape responses only to very abrupt mechanical stimuli. In light of recent evidence that electrical synapses are common in the mammalian central nervous system, the mechanisms of coincidence detection described here may be widely used in many systems.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Detection of a visual signal can be facilitated by simultaneous presentation of a similar subthreshold signal. Here we show that the facilitatory effect of a subthreshold signal can persist for more than 16 s. Presenting a near-threshold Gabor signal (prime) produced a phase-independent increase in contrast sensitivity (40%) to similar successive signals (target) for a period of up to 16 s. This effect was obtained only when both prime and target were presented to the same eye. We further show that the memory trace is inactivated by presenting high-contrast signals before the target. These results suggest that activated neurons in the primary visual cortex retain a near-threshold memory trace that persists until reactivated.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

At early stages in visual processing cells respond to local stimuli with specific features such as orientation and spatial frequency. Although the receptive fields of these cells have been thought to be local and independent, recent physiological and psychophysical evidence has accumulated, indicating that the cells participate in a rich network of local connections. Thus, these local processing units can integrate information over much larger parts of the visual field; the pattern of their response to a stimulus apparently depends on the context presented. To explore the pattern of lateral interactions in human visual cortex under different context conditions we used a novel chain lateral masking detection paradigm, in which human observers performed a detection task in the presence of different length chains of high-contrast-flanked Gabor signals. The results indicated a nonmonotonic relation of the detection threshold with the number of flankers. Remote flankers had a stronger effect on target detection when the space between them was filled with other flankers, indicating that the detection threshold is caused by dynamics of large neuronal populations in the neocortex, with a major interplay between excitation and inhibition. We considered a model of the primary visual cortex as a network consisting of excitatory and inhibitory cell populations, with both short- and long-range interactions. The model exhibited a behavior similar to the experimental results throughout a range of parameters. Experimental and modeling results indicated that long-range connections play an important role in visual perception, possibly mediating the effects of context.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Knowledge of the stage composition and the temporal dynamics of human cognitive operations is critical for building theories of higher mental activity. This information has been difficult to acquire, even with different combinations of techniques such as refined behavioral testing, electrical recording/interference, and metabolic imaging studies. Verbal object comprehension was studied herein in a single individual, by using three tasks (object naming, auditory word comprehension, and visual word comprehension), two languages (English and Farsi), and four techniques (stimulus manipulation, direct cortical electrical interference, electrocorticography, and a variation of the technique of direct cortical electrical interference to produce time-delimited effects, called timeslicing), in a subject in whom indwelling subdural electrode arrays had been placed for clinical purposes. Electrical interference at a pair of electrodes on the left lateral occipitotemporal gyrus interfered with naming in both languages and with comprehension in the language tested (English). The naming and comprehension deficit resulted from interference with processing of verbal object meaning. Electrocorticography indices of cortical activation at this site during naming started 250–300 msec after visual stimulus presentation. By using the timeslicing technique, which varies the onset of electrical interference relative to the behavioral task, we found that completion of processing for verbal object meaning varied from 450 to 750 msec after current onset. This variability was found to be a function of the subject’s familiarity with the objects.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Functional anatomical and single-unit recording studies indicate that a set of neural signals in parietal and frontal cortex mediates the covert allocation of attention to visual locations, as originally proposed by psychological studies. This frontoparietal network is the source of a location bias that interacts with extrastriate regions of the ventral visual system during object analysis to enhance visual processing. The frontoparietal network is not exclusively related to visual attention, but may coincide or overlap with regions involved in oculomotor processing. The relationship between attention and eye movement processes is discussed at the psychological, functional anatomical, and cellular level of analysis.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Working memory is the process of actively maintaining a representation of information for a brief period of time so that it is available for use. In monkeys, visual working memory involves the concerted activity of a distributed neural system, including posterior areas in visual cortex and anterior areas in prefrontal cortex. Within visual cortex, ventral stream areas are selectively involved in object vision, whereas dorsal stream areas are selectively involved in spatial vision. This domain specificity appears to extend forward into prefrontal cortex, with ventrolateral areas involved mainly in working memory for objects and dorsolateral areas involved mainly in working memory for spatial locations. The organization of this distributed neural system for working memory in monkeys appears to be conserved in humans, though some differences between the two species exist. In humans, as compared with monkeys, areas specialized for object vision in the ventral stream have a more inferior location in temporal cortex, whereas areas specialized for spatial vision in the dorsal stream have a more superior location in parietal cortex. Displacement of both sets of visual areas away from the posterior perisylvian cortex may be related to the emergence of language over the course of brain evolution. Whereas areas specialized for object working memory in humans and monkeys are similarly located in ventrolateral prefrontal cortex, those specialized for spatial working memory occupy a more superior and posterior location within dorsal prefrontal cortex in humans than in monkeys. As in posterior cortex, this displacement in frontal cortex also may be related to the emergence of new areas to serve distinctively human cognitive abilities.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The perceived colors of reflecting surfaces generally remain stable despite changes in the spectrum of the illuminating light. This color constancy can be measured operationally by asking observers to distinguish illuminant changes on a scene from changes in the reflecting properties of the surfaces comprising it. It is shown here that during fast illuminant changes, simultaneous changes in spectral reflectance of one or more surfaces in an array of other surfaces can be readily detected almost independent of the numbers of surfaces, suggesting a preattentive, spatially parallel process. This process, which is perfect over a spatial window delimited by the anatomical fovea, may form an early input to a multistage analysis of surface color, providing the visual system with information about a rapidly changing world in advance of the generation of a more elaborate and stable perceptual representation.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

When the illumination of a visual scene changes, the quantity of light reflected from objects is altered. Despite this, the perceived lightness of the objects generally remains constant. This perceptual lightness constancy is thought to be important behaviorally for object recognition. Here we show that interactions from outside the classical receptive fields of neurons in primary visual cortex modulate neural responses in a way that makes them immune to changes in illumination, as is perception. This finding is consistent with the hypothesis that the responses of neurons in primary visual cortex carry information about surface lightness in addition to information about form. It also suggests that lightness constancy, which is sometimes thought to involve “higher-level” processes, is manifest at the first stage of visual cortical processing.