830 results for Vision computationnelle


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The evolution of smartphones, all equipped with digital cameras, is driving a growing demand for ever more complex applications that rely on real-time computer vision algorithms. However, video signals keep increasing in size, whereas the performance of single-core processors has stagnated in the past few years. Consequently, new computer vision algorithms need to be parallel so that they can run on multiple processors and scale computationally. One of the most promising classes of processors today is the graphics processing unit (GPU), a device offering a high degree of parallelism, excellent numerical performance and increasing versatility, which makes it attractive for scientific computing. In this thesis, we explore two computer vision applications whose high computational complexity precludes them from running in real time on traditional uniprocessors. We show that by parallelizing their subtasks and implementing them on a GPU, both applications reach their goal of running at interactive frame rates. In addition, we propose a technique for the fast evaluation of arbitrarily complex functions, specially designed for GPU implementation.
First, we explore the application of depth-image-based rendering techniques to the unusual configuration of two convergent, wide-baseline cameras with color and depth information, in contrast to the narrow-baseline, parallel cameras commonly used in 3D TV. Using a backward mapping approach with a depth inpainting scheme based on modified median filters, we show that these techniques are adequate for free-viewpoint video applications. In addition, we show that referring depth information to a global reference system is ill-advised and should be avoided.
Then, we propose a background subtraction system based on kernel density estimation techniques. These techniques are well suited to modelling complex scenes with multimodal backgrounds, but have seen little use because of their high computational and memory complexity. The proposed system, implemented in real time on a GPU, features novel proposals for dynamic kernel bandwidth estimation for the background model, selective update of the background model, update of the position of the reference samples of the foreground model using a multi-region particle filter, and automatic selection of regions of interest to reduce computational cost. The results, evaluated on several databases and compared with other state-of-the-art algorithms, demonstrate the high quality and versatility of our proposal.
Finally, we propose a general method for approximating arbitrarily complex functions with continuous piecewise linear functions, specially formulated for GPU implementation by leveraging the texture filtering units, which are normally unused for numerical computation. Our proposal features a rigorous mathematical analysis of the approximation error as a function of the number of samples, as well as a method to obtain a near-optimal partition of the domain of the function that minimizes the approximation error.
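
As a rough illustration of the last contribution, the sketch below emulates in plain Python the linear interpolation between stored samples that a texture filtering unit performs in hardware. It is not the thesis's method: it assumes a uniform partition of the domain (the thesis describes a near-optimal, presumably non-uniform one), and the names `build_table`, `lerp_lookup` and `max_abs_error` are hypothetical. The bound used is the textbook one for linear interpolation, h^2/8 times the maximum of |f''|.

```python
import math

def build_table(f, a, b, n):
    """Sample f at n + 1 equally spaced nodes on [a, b] (assumed uniform partition)."""
    h = (b - a) / n
    return [f(a + i * h) for i in range(n + 1)]

def lerp_lookup(table, a, b, x):
    """Evaluate the continuous piecewise linear interpolant of the table at x."""
    n = len(table) - 1
    t = (min(max(x, a), b) - a) / (b - a) * n  # position in units of table cells
    i = min(int(t), n - 1)                     # cell index, clamped at the right edge
    frac = t - i                               # fractional offset inside the cell
    return (1.0 - frac) * table[i] + frac * table[i + 1]

def max_abs_error(f, table, a, b, probes=10001):
    """Largest |f(x) - interpolant(x)| over a dense grid of probe points."""
    step = (b - a) / (probes - 1)
    return max(abs(f(a + k * step) - lerp_lookup(table, a, b, a + k * step))
               for k in range(probes))

# Illustrative target: sin on [0, pi], where max|f''| = 1.
a, b = 0.0, math.pi
err_64 = max_abs_error(math.sin, build_table(math.sin, a, b, 64), a, b)
err_128 = max_abs_error(math.sin, build_table(math.sin, a, b, 128), a, b)
bound_64 = ((b - a) / 64) ** 2 / 8  # h^2 / 8 * max|f''| for 64 cells
```

Doubling the number of samples should cut the measured error by roughly a factor of four, which is the quadratic dependence on sample spacing that an error analysis of this kind quantifies.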

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a fully autonomous solution for the Indoor Challenge of the 2013 International Micro Air Vehicle Competition (IMAV 2013). Our proposal is a multi-robot system with no centralized coordination whose robotic agents share their position estimates. The capability of each agent to navigate while avoiding collisions is a consequence of the resulting emergent behavior. Each agent consists of a ground station running an instance of the proposed architecture that communicates over WiFi with an AR Drone 2.0 quadrotor. Visual markers are employed to sense and map obstacles and to improve the pose estimation based on Inertial Measurement Unit (IMU) and ground optical flow data. Based on our architecture, each robotic agent can navigate while avoiding obstacles and other members of the multi-robot system. The solution is demonstrated, and the achieved navigation performance evaluated, by means of experimental flights. This work also analyzes the capabilities of the presented solution in simulated flights of the IMAV 2013 Indoor Challenge. The CVG UPM team was awarded First Prize in the Indoor Autonomy Challenge of the IMAV 2013 competition.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Praying mantids use binocular cues to judge whether their prey is in striking distance. When there are several moving targets within their binocular visual field, mantids need to solve the correspondence problem. They must select between the possible pairings of retinal images in the two eyes so that they can strike at a single real target. In this study, mantids were presented with two targets in various configurations, and the resulting fixating saccades that precede the strike were analyzed. The distributions of saccades show that mantids consistently prefer one out of several possible matches. Selection is in part guided by the position and the spatiotemporal features of the target image in each eye. Selection also depends upon the binocular disparity of the images, suggesting that insects can perform local binocular computations. The pairing rules ensure that mantids tend to aim at real targets and not at “ghost” targets arising from false matches.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

It is known that the squirrel monkey, marmoset, and other related New World (NW) monkeys possess three high-frequency alleles at the single X-linked photopigment locus, and that the spectral sensitivity peaks of these alleles are within those delimited by the human red and green pigment genes. The three alleles in the squirrel monkey and marmoset have been sequenced previously. In this study, the three alleles were found and sequenced in the saki monkey, capuchin, and tamarin. Although the capuchin and tamarin belong to the same family as the squirrel monkey and marmoset, the saki monkey belongs to a different family and is one of the species that is most divergent from the squirrel monkey and marmoset, suggesting the presence of the triallelic system in many NW monkeys. The nucleotide sequences of these alleles from the five species studied indicate that gene conversion occurs frequently and has partially or completely homogenized intronic and exonic regions of the alleles in each species, making it appear that a triallelic system arose independently in each of the five species studied. Nevertheless, a detailed analysis suggests that the triallelic system arose only once in the NW monkey lineage, from a middle wavelength (green) opsin gene, and that the amino acid differences at functionally critical sites among alleles have been maintained by natural selection in NW monkeys for >20 million years. Moreover, the two X-linked opsin genes of howler monkeys (a NW monkey genus) were evidently derived from the incorporation of a middle (green) and a long wavelength (red) allele into one chromosome; these two genes together with the (autosomal) blue opsin gene would immediately enable even a male monkey to have trichromatic vision.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Deciphering the information that eyes, ears, and other sensory organs transmit to the brain is important for understanding the neural basis of behavior. Recordings from single sensory nerve cells have yielded useful insights, but single neurons generally do not mediate behavior; networks of neurons do. Monitoring the activity of all cells in a neural network of a behaving animal, however, is not yet possible. Taking an alternative approach, we used a realistic cell-based model to compute the ensemble of neural activity generated by one sensory organ, the lateral eye of the horseshoe crab, Limulus polyphemus. We studied how the neural network of this eye encodes natural scenes by presenting to the model movies recorded with a video camera mounted above the eye of an animal that was exploring its underwater habitat. Model predictions were confirmed by simultaneously recording responses from single optic nerve fibers of the same animal. We report here that the eye transmits to the brain robust “neural images” of objects having the size, contrast, and motion of potential mates. The neural code for such objects is not found in ambiguous messages of individual optic nerve fibers but rather in patterns of coherent activity that extend over small ensembles of nerve fibers and are bound together by stimulus motion. Integrative properties of neurons in the first synaptic layer of the brain appear well suited to detecting the patterns of coherent activity. Neural coding by this relatively simple eye helps explain how horseshoe crabs find mates and may lead to a better understanding of how more complex sensory organs process information.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The coelacanth, a “living fossil,” lives near the coast of the Comoros archipelago in the Indian Ocean. Living at a depth of about 200 m, the Comoran coelacanth receives only a narrow range of light, at about 480 nm. To detect the entire range of “color” at this depth, the coelacanth appears to use only two closely related paralogous RH1 and RH2 visual pigments with the optimum light sensitivities (λmax) at 478 nm and 485 nm, respectively. The λmax values are shifted about 20 nm toward blue compared with those of the corresponding orthologous pigments. Mutagenesis experiments show that each of these coadapted changes is fully explained by two amino acid replacements.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Blindsight is the rare and paradoxical ability of some human subjects with occipital lobe brain damage to discriminate unseen stimuli in their clinically blind field defects when forced-choice procedures are used, implying that lesions of striate cortex produce a sharp dissociation between visual performance and visual awareness. Skeptics have argued that this is no different from the behavior of normal subjects at the lower limits of conscious vision, at which such dissociations could arise trivially by using different response criteria during clinical and forced-choice tests. We tested this claim explicitly by measuring the sensitivity of a hemianopic patient independently of his response criterion in yes-no and forced-choice detection tasks with the same stimulus and found that, unlike normal controls, his sensitivity was significantly higher during the forced-choice task. Thus, the dissociation by which blindsight is defined is not simply due to a difference in the patient’s response bias between the two paradigms. This result implies that blindsight is unlike normal, near-threshold vision and that information about the stimulus is processed in blindsighted patients in an unusual way.
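
To make "sensitivity independently of response criterion" concrete, here is a minimal sketch using the standard equal-variance Gaussian signal detection model. The abstract does not say which analysis the authors used, so the model is an assumption, and the function names and hit and false-alarm rates are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist().inv_cdf  # z-transform: inverse of the standard normal CDF

def yes_no(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) for a yes-no detection task."""
    zh, zf = _Z(hit_rate), _Z(false_alarm_rate)
    return zh - zf, -0.5 * (zh + zf)

def two_afc(proportion_correct):
    """Return d' for an unbiased two-alternative forced-choice task."""
    return sqrt(2.0) * _Z(proportion_correct)

# Hypothetical observers with the same sensitivity but different response bias.
liberal = yes_no(0.9332, 0.3085)       # says "yes" readily: z = +1.5 and -0.5
conservative = yes_no(0.6915, 0.0668)  # says "yes" rarely:  z = +0.5 and -1.5
```

Both hypothetical observers come out with a d' of about 2 despite very different hit rates, and only the criterion changes sign. Under this model, a purely criterion-based explanation predicts equal d' in the yes-no and forced-choice tasks, so a higher d' in the forced-choice task cannot be attributed to response bias.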

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Objective: To assess whether population screening for impaired vision among older people in the community leads to improvements in vision.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The minimum levels of staffing, services, budget, and technology that should be provided by a library specializing in vision science are presented. The scope and coverage of the collection are described as well. These standards may be used by institutions establishing libraries or by accrediting bodies reviewing existing libraries.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Efficient and reliable classification of visual stimuli requires that their representations reside in a low-dimensional and, therefore, computationally manageable feature space. We investigated the ability of the human visual system to derive such representations from the sensory input, a highly nontrivial task given the million or so dimensions of the visual signal at its entry point to the cortex. In a series of experiments, subjects were presented with sets of parametrically defined shapes; the points in the common high-dimensional parameter space corresponding to the individual shapes formed regular planar (two-dimensional) patterns such as a triangle, a square, etc. We then used multidimensional scaling to arrange the shapes in planar configurations, dictated by their experimentally determined perceived similarities. The resulting configurations closely resembled the original arrangements of the stimuli in the parameter space. This achievement of the human visual system was replicated by a computational model derived from a theory of object representation in the brain, according to which similarities between objects, and not the geometry of each object, need to be faithfully represented.
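
The multidimensional scaling step can be sketched as follows. This is classical (metric) MDS, computed by double centering and power iteration. The study may well have used a different, for example nonmetric, variant, and the four-point rectangle below is a hypothetical stand-in for measured dissimilarities.

```python
import math
import random

def classical_mds(dist, dims=2, iters=500, seed=0):
    """Embed n items in `dims` dimensions from a symmetric n x n distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    # Double centering turns squared distances into a Gram matrix of centered coordinates.
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        lam = 0.0
        for _ in range(iters):  # power iteration for the leading eigenpair
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:    # remaining spectrum is numerically zero
                lam = 0.0
                break
            v = [x / norm for x in w]
            lam = norm
        for i in range(n):
            coords[i][d] = math.sqrt(lam) * v[i]
        # Deflate so that the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords

# Hypothetical stimuli arranged as a 2 x 1 rectangle in parameter space.
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
embedding = classical_mds(dist)
```

The embedding reproduces the pairwise distances of the original rectangle up to rotation and reflection, which is the sense in which a recovered configuration can "resemble" the arrangement in parameter space.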

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The majority of neurons in the primary visual cortex of primates can be activated by stimulation of either eye; moreover, the monocular receptive fields of such neurons are located in about the same region of visual space. These well-known facts imply that binocular convergence in visual cortex can explain our cyclopean view of the world. To test the adequacy of this assumption, we examined how human subjects integrate binocular events in time. Light flashes presented synchronously to both eyes were compared to flashes presented alternately (asynchronously) to one eye and then the other. Subjects perceived very-low-frequency (2 Hz) asynchronous trains as equivalent to synchronous trains flashed at twice the frequency (the prediction based on binocular convergence). However, at higher frequencies of presentation (4-32 Hz), subjects perceived asynchronous and synchronous trains to be increasingly similar. Indeed, at the flicker-fusion frequency (approximately 50 Hz), the apparent difference between the two conditions was only 2%. We suggest that the explanation of these anomalous findings is that we parse visual input into sequential episodes.

Relevance:

20.00% 20.00%

Publisher: