926 resultados para Image Processing, Visual Prostheses, Visual Information, Artificial Human Vision, Visual Perception


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Image filtering is a highly demanded approach of image enhancement in digital imaging systems design. It is widely used in television and camera design technologies to improve the quality of an output image to avoid various problems such as image blurring problem thatgains importance in design of displays of large sizes and design of digital cameras. This thesis proposes a new image filtering method basedon visual characteristics of human eye such as MTF. In contrast to the traditional filtering methods based on human visual characteristics this thesis takes into account the anisotropy of the human eye vision. The proposed method is based on laboratory measurements of the human eye MTF and takes into account degradation of the image by the latter. This method improves an image in the way it will be degraded by human eye MTF to give perception of the original image quality. This thesis gives a basic understanding of an image filtering approach and the concept of MTF and describes an algorithm to perform an image enhancement based on MTF of human eye. Performed experiments have shown quite good results according to human evaluation. Suggestions to improve the algorithm are also given for the future improvements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Psychophysical studies suggest that humans preferentially use a narrow band of low spatial frequencies for face recognition. Here we asked whether artificial face recognition systems have an improved recognition performance at the same spatial frequencies as humans. To this end, we estimated recognition performance over a large database of face images by computing three discriminability measures: Fisher Linear Discriminant Analysis, Non-Parametric Discriminant Analysis, and Mutual Information. In order to address frequency dependence, discriminabilities were measured as a function of (filtered) image size. All three measures revealed a maximum at the same image sizes, where the spatial frequency content corresponds to the psychophysical found frequencies. Our results therefore support the notion that the critical band of spatial frequencies for face recognition in humans and machines follows from inherent properties of face images, and that the use of these frequencies is associated with optimal face recognition performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The aim of this study was to compare the contrast visual processing of concentric sinusoidal gratings stimuli between adolescents and adults. The study included 20 volunteers divided into two groups: 10 adolescents aged 13-19 years (M=16.5, SD=1.65) and 10 adults aged 20-26 years (M=21.8, SD=2.04). In order to measure the contrast sensitivity at spatial frequencies of 0.6, 2.5, 5 and 20 degrees of visual angle (cpd), it was used the psychophysical method of two alternative forced choice (2AFC). A One Way ANOVA performance showed a significant difference in the comparison between groups: F [(4, 237)=3.74, p<.05]. The post-hoc Tukey HSD showed a significant difference between the frequencies of 0.6 (p <.05) and 20 cpd (p<.05). Thus, the results showed that the visual perception behaves differently with regard to the sensory mechanisms that render the contrast towards adolescents and adults. These results are useful to better characterize and comprehend human vision development.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Navigation is a broad topic that has been receiving considerable attention from the mobile robotic community over the years. In order to execute autonomous driving in outdoor urban environments it is necessary to identify parts of the terrain that can be traversed and parts that should be avoided. This paper describes an analyses of terrain identification based on different visual information using a MLP artificial neural network and combining responses of many classifiers. Experimental tests using a vehicle and a video camera have been conducted in real scenarios to evaluate the proposed approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The motivation for this thesis work is the need for improving reliability of equipment and quality of service to railway passengers as well as a requirement for cost-effective and efficient condition maintenance management for rail transportation. This thesis work develops a fusion of various machine vision analysis methods to achieve high performance in automation of wooden rail track inspection.The condition monitoring in rail transport is done manually by a human operator where people rely on inference systems and assumptions to develop conclusions. The use of conditional monitoring allows maintenance to be scheduled, or other actions to be taken to avoid the consequences of failure, before the failure occurs. Manual or automated condition monitoring of materials in fields of public transportation like railway, aerial navigation, traffic safety, etc, where safety is of prior importance needs non-destructive testing (NDT).In general, wooden railway sleeper inspection is done manually by a human operator, by moving along the rail sleeper and gathering information by visual and sound analysis for examining the presence of cracks. Human inspectors working on lines visually inspect wooden rails to judge the quality of rail sleeper. In this project work the machine vision system is developed based on the manual visual analysis system, which uses digital cameras and image processing software to perform similar manual inspections. As the manual inspection requires much effort and is expected to be error prone sometimes and also appears difficult to discriminate even for a human operator by the frequent changes in inspected material. The machine vision system developed classifies the condition of material by examining individual pixels of images, processing them and attempting to develop conclusions with the assistance of knowledge bases and features.A pattern recognition approach is developed based on the methodological knowledge from manual procedure. The pattern recognition approach for this thesis work was developed and achieved by a non destructive testing method to identify the flaws in manually done condition monitoring of sleepers.In this method, a test vehicle is designed to capture sleeper images similar to visual inspection by human operator and the raw data for pattern recognition approach is provided from the captured images of the wooden sleepers. The data from the NDT method were further processed and appropriate features were extracted.The collection of data by the NDT method is to achieve high accuracy in reliable classification results. A key idea is to use the non supervised classifier based on the features extracted from the method to discriminate the condition of wooden sleepers in to either good or bad. Self organising map is used as classifier for the wooden sleeper classification.In order to achieve greater integration, the data collected by the machine vision system was made to interface with one another by a strategy called fusion. Data fusion was looked in at two different levels namely sensor-level fusion, feature- level fusion. As the goal was to reduce the accuracy of the human error on the rail sleeper classification as good or bad the results obtained by the feature-level fusion compared to that of the results of actual classification were satisfactory.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work presents a cooperative navigation systemof a humanoid robot and a wheeled robot using visual information, aiming to navigate the non-instrumented humanoid robot using information obtained from the instrumented wheeled robot. Despite the humanoid not having sensors to its navigation, it can be remotely controlled by infra-red signals. Thus, the wheeled robot can control the humanoid positioning itself behind him and, through visual information, find it and navigate it. The location of the wheeled robot is obtained merging information from odometers and from landmarks detection, using the Extended Kalman Filter. The marks are visually detected, and their features are extracted by image processing. Parameters obtained by image processing are directly used in the Extended Kalman Filter. Thus, while the wheeled robot locates and navigates the humanoid, it also simultaneously calculates its own location and maps the environment (SLAM). The navigation is done through heuristic algorithms based on errors between the actual and desired pose for each robot. The main contribution of this work was the implementation of a cooperative navigation system for two robots based on visual information, which can be extended to other robotic applications, as the ability to control robots without interfering on its hardware, or attaching communication devices

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Human motion seems to be guided by some optimal principles. In general, it is assumed that human walking is generated with minimal energy consumption. However, in the presence of disturbances during gait, there is a trade-off between stability (avoiding a fall) and energy-consumption. This work analyses the obstacle-crossing with the leading foot. It was hypothesized that energy-saving mechanisms during obstacle-crossing are modulated by the requirement to avoid a fall using the available sensory information, particularly, by vision. A total of fourteen subjects, seven with no visual impairment and seven blind, walked along a 5 meter flat pathway with an obstacle of 0.26 m height located at 3 m from the starting point. The seven subjects with normal vision crossed the obstacle successfully 30 times in two conditions: blindfolded and with normal vision. The seven blind subjects did the same 30 times. The motion of the leading limb was recorded by video at 60 Hz. There were markers placed on the subject's hip, knee, ankle, rear foot, and forefoot. The motion data were filtered with a fourth order Butterworth filter with a cut-off frequency of 4 Hz. The following variables were calculated: horizontal distance between the leading foot and the obstacle at toe-off prior to (DHPO) and after (DHOP) crossing, minimal vertical height from the foot to the obstacle (DVPO), average step velocity (VELOm). The segmental energies were also calculated and the work consumed by the leading limb during the crossing obstacle was computed for each trial. A statistical analysis repeated-measures ANOVA was conducted on these dependent variables revealing significant differences between the vision and non-vision conditions in healthy subjects. In addition, there were no significant differences between the blind and people with vision blindfolded. These results indicate that vision is crucial to determine the optimal trade-off between energy consumption and avoiding a trip during obstacle crossing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Different from the first attempts to solve the image categorization problem (often based on global features), recently, several researchers have been tackling this research branch through a new vantage point - using features around locally invariant interest points and visual dictionaries. Although several advances have been done in the visual dictionaries literature in the past few years, a problem we still need to cope with is calculation of the number of representative words in the dictionary. Therefore, in this paper we introduce a new solution for automatically finding the number of visual words in an N-Way image categorization problem by means of supervised pattern classification based on optimum-path forest. © 2011 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This study investigated the influence of top-down and bottom-up information on speech perception in complex listening environments. Specifically, the effects of listening to different types of processed speech were examined on intelligibility and on simultaneous visual-motor performance. The goal was to extend the generalizability of results in speech perception to environments outside of the laboratory. The effect of bottom-up information was evaluated with natural, cell phone and synthetic speech. The effect of simultaneous tasks was evaluated with concurrent visual-motor and memory tasks. Earlier works on the perception of speech during simultaneous visual-motor tasks have shown inconsistent results (Choi, 2004; Strayer & Johnston, 2001). In the present experiments, two dual-task paradigms were constructed in order to mimic non-laboratory listening environments. In the first two experiments, an auditory word repetition task was the primary task and a visual-motor task was the secondary task. Participants were presented with different kinds of speech in a background of multi-speaker babble and were asked to repeat the last word of every sentence while doing the simultaneous tracking task. Word accuracy and visual-motor task performance were measured. Taken together, the results of Experiments 1 and 2 showed that the intelligibility of natural speech was better than synthetic speech and that synthetic speech was better perceived than cell phone speech. The visual-motor methodology was found to demonstrate independent and supplemental information and provided a better understanding of the entire speech perception process. Experiment 3 was conducted to determine whether the automaticity of the tasks (Schneider & Shiffrin, 1977) helped to explain the results of the first two experiments. It was found that cell phone speech allowed better simultaneous pursuit rotor performance only at low intelligibility levels when participants ignored the listening task. Also, simultaneous task performance improved dramatically for natural speech when intelligibility was good. Overall, it could be concluded that knowledge of intelligibility alone is insufficient to characterize processing of different speech sources. Additional measures such as attentional demands and performance of simultaneous tasks were also important in characterizing the perception of different kinds of speech in complex listening environments.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article presents a novel system and a control strategy for visual following of a 3D moving object by an Unmanned Aerial Vehicle UAV. The presented strategy is based only on the visual information given by an adaptive tracking method based on the color information, which jointly with the dynamics of a camera fixed to a rotary wind UAV are used to develop an Image-based visual servoing IBVS system. This system is focused on continuously following a 3D moving target object, maintaining it with a fixed distance and centered on the image plane. The algorithm is validated on real flights on outdoors scenarios, showing the robustness of the proposed systems against winds perturbations, illumination and weather changes among others. The obtained results indicate that the proposed algorithms is suitable for complex controls task, such object following and pursuit, flying in formation, as well as their use for indoor navigation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A proposal for a model of the primary visual cortex is reported. It is structured with the basis of a simple unit cell able to perform fourteen pairs of different boolean functions corresponding to the two possible inputs. As a first step, a model of the retina is presented. Different types of responses, according to the different possibilities of interconnecting the building blocks, have been obtained. These responses constitute the basis for an initial configuration of the mammalian primary visual cortex. Some qualitative functions, as symmetry or size of an optical input, have been obtained. A proposal to extend this model to some higher functions, concludes the paper.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

One of the most challenging problems that must be solved by any theoretical model purporting to explain the competence of the human brain for relational tasks is the one related with the analysis and representation of the internal structure in an extended spatial layout of múltiple objects. In this way, some of the problems are related with specific aims as how can we extract and represent spatial relationships among objects, how can we represent the movement of a selected object and so on. The main objective of this paper is the study of some plausible brain structures that can provide answers in these problems. Moreover, in order to achieve a more concrete knowledge, our study will be focused on the response of the retinal layers for optical information processing and how this information can be processed in the first cortex layers. The model to be reported is just a first trial and some major additions are needed to complete the whole vision process.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

El presente proyecto trata sobre uno de los campos más problemáticos de la inteligencia artificial, el reconocimiento facial. Algo tan sencillo para las personas como es reconocer una cara conocida se traduce en complejos algoritmos y miles de datos procesados en cuestión de segundos. El proyecto comienza con un estudio del estado del arte de las diversas técnicas de reconocimiento facial, desde las más utilizadas y probadas como el PCA y el LDA, hasta técnicas experimentales que utilizan imágenes térmicas en lugar de las clásicas con luz visible. A continuación, se ha implementado una aplicación en lenguaje C++ que sea capaz de reconocer a personas almacenadas en su base de datos leyendo directamente imágenes desde una webcam. Para realizar la aplicación, se ha utilizado una de las librerías más extendidas en cuanto a procesado de imágenes y visión artificial, OpenCV. Como IDE se ha escogido Visual Studio 2010, que cuenta con una versión gratuita para estudiantes. La técnica escogida para implementar la aplicación es la del PCA ya que es una técnica básica en el reconocimiento facial, y además sirve de base para soluciones mucho más complejas. Se han estudiado los fundamentos matemáticos de la técnica para entender cómo procesa la información y en qué se datos se basa para realizar el reconocimiento. Por último, se ha implementado un algoritmo de testeo para poder conocer la fiabilidad de la aplicación con varias bases de datos de imágenes faciales. De esta forma, se puede comprobar los puntos fuertes y débiles del PCA. ABSTRACT. This project deals with one of the most problematic areas of artificial intelligence, facial recognition. Something so simple for human as to recognize a familiar face becomes into complex algorithms and thousands of data processed in seconds. The project begins with a study of the state of the art of various face recognition techniques, from the most used and tested as PCA and LDA, to experimental techniques that use thermal images instead of the classic visible light images. Next, an application has been implemented in C + + language that is able to recognize people stored in a database reading images directly from a webcam. To make the application, it has used one of the most outstretched libraries in terms of image processing and computer vision, OpenCV. Visual Studio 2010 has been chosen as the IDE, which has a free student version. The technique chosen to implement the software is the PCA because it is a basic technique in face recognition, and also provides a basis for more complex solutions. The mathematical foundations of the technique have been studied to understand how it processes the information and which data are used to do the recognition. Finally, an algorithm for testing has been implemented to know the reliability of the application with multiple databases of facial images. In this way, the strengths and weaknesses of the PCA can be checked.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

At early stages in visual processing cells respond to local stimuli with specific features such as orientation and spatial frequency. Although the receptive fields of these cells have been thought to be local and independent, recent physiological and psychophysical evidence has accumulated, indicating that the cells participate in a rich network of local connections. Thus, these local processing units can integrate information over much larger parts of the visual field; the pattern of their response to a stimulus apparently depends on the context presented. To explore the pattern of lateral interactions in human visual cortex under different context conditions we used a novel chain lateral masking detection paradigm, in which human observers performed a detection task in the presence of different length chains of high-contrast-flanked Gabor signals. The results indicated a nonmonotonic relation of the detection threshold with the number of flankers. Remote flankers had a stronger effect on target detection when the space between them was filled with other flankers, indicating that the detection threshold is caused by dynamics of large neuronal populations in the neocortex, with a major interplay between excitation and inhibition. We considered a model of the primary visual cortex as a network consisting of excitatory and inhibitory cell populations, with both short- and long-range interactions. The model exhibited a behavior similar to the experimental results throughout a range of parameters. Experimental and modeling results indicated that long-range connections play an important role in visual perception, possibly mediating the effects of context.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Proper understanding of processes underlying visual perception requires information on the activation order of distinct brain areas. We measured dynamics of cortical signals with magnetoencephalography while human subjects viewed stimuli at four visual quadrants. The signals were analyzed with minimum current estimates at the individual and group level. Activation emerged 55–70 ms after stimulus onset both in the primary posterior visual areas and in the anteromedial part of the cuneus. Other cortical areas were active after this initial dual activation. Comparison of data between species suggests that the anteromedial cuneus either comprises a homologue of the monkey area V6 or is an area unique to humans. Our results show that visual stimuli activate two cortical areas right from the beginning of the cortical response. The anteromedial cuneus has the temporal position needed to interact with the primary visual cortex V1 and thereby to modify information transferred via V1 to extrastriate cortices.