916 results for human vision
Abstract:
An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called "crowding". Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, "compulsory averaging", and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality.
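The integration mechanism described above can be illustrated with a toy population-coding sketch. This is not the authors' actual model: it simply pools the responses of a bank of orientation-tuned units to a target and two flankers and decodes a single orientation with a population vector, which yields a "compulsory average" between target and flanker orientations. All function names and parameter values here are illustrative assumptions.

```python
import numpy as np

def population_response(theta, pref_orients, kappa=2.0):
    """Population code for an orientation theta (radians, period pi).
    Each unit has a preferred orientation and a broad, circular
    Gaussian-like tuning curve."""
    return np.exp(kappa * (np.cos(2 * (pref_orients - theta)) - 1))

def integrate_and_decode(thetas, pref_orients, kappa=2.0):
    """Sum the population responses to several nearby orientation
    signals, then decode one orientation with a population vector --
    the read-out reports roughly the average ("compulsory averaging")."""
    pooled = sum(population_response(t, pref_orients, kappa) for t in thetas)
    z = np.sum(pooled * np.exp(2j * pref_orients))  # doubled-angle vector sum
    return (np.angle(z) / 2) % np.pi

prefs = np.linspace(0, np.pi, 64, endpoint=False)
# Target at 10 deg flanked by two distractors at 30 deg:
decoded = integrate_and_decode(np.deg2rad([10, 30, 30]), prefs)
print(np.rad2deg(decoded))  # between 10 and 30, pulled toward the flankers
```

The doubled-angle trick handles the pi-periodicity of orientation; the decoded value lies between the target and flanker orientations, as spatial pooling predicts.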
Abstract:
Changes in human visual sensitivity under color adaptation were studied with the method of minimum step change. Subjects were asked to judge the just-noticeable-difference threshold for a 1° target stimulus superimposed on a 6° background. The target and the surrounding field flickered, and different kinds of stimuli were shown to the subjects. Comparing pre- and post-color-adaptation data showed that: (1) evidence of an opponent effect was found; this result contradicts the coefficient law and supports the "two-process" hypothesis; (2) the opponent effect depends strongly on the kind of stimulus; (3) when the background light flickers, subjects' sensitivity to the target stimulus increases markedly, whereas a flickering target produces a smaller increase; however, the increase in sensitivity appears to be unrelated to color adaptation.
Abstract:
We report a series of psychophysical experiments that explore different aspects of the problem of object representation and recognition in human vision. Contrary to the paradigmatic view which holds that the representations are three-dimensional and object-centered, the results consistently support the notion of view-specific representations that include at most partial depth information. In simulated experiments that involved the same stimuli shown to the human subjects, computational models built around two-dimensional multiple-view representations replicated our main psychophysical results, including patterns of generalization errors and the time course of perceptual learning.
Abstract:
We provide a theory of the three-dimensional interpretation of a class of line drawings called p-images, which the human vision system interprets as parallelepipeds ("boxes"). Despite their simplicity, p-images raise a number of interesting vision questions:
* Why are p-images seen as three-dimensional objects, and not simply as flat images?
* What are the dimensions and pose of the perceived objects?
* Why are some p-images interpreted as rectangular boxes, while others are seen as skewed, even though there is no obvious distinction between the images?
* When p-images are rotated in three dimensions, why are the image sequences perceived as distorting objects, even though structure-from-motion would predict that rigid objects would be seen?
* Why are some three-dimensional parallelepipeds seen as radically different when viewed from different viewpoints?
We show that these and related questions can be answered with the help of a single mathematical result and an associated perceptual principle. An interesting special case arises when there are right angles in the p-image. This case represents a singularity in the equations and is mystifying from the vision point of view. It would seem that, at least in this case, the vision system does not follow the ordinary rules of geometry but operates in accordance with other, as yet unknown, principles.
Abstract:
Under normal viewing conditions, humans find it easy to distinguish between objects made out of different materials such as plastic, metal, or paper. Untextured materials such as these have different surface reflectance properties, including lightness and gloss. With single isolated images and unknown illumination conditions, the task of estimating surface reflectance is highly underconstrained, because many combinations of reflection and illumination are consistent with a given image. In order to work out how humans estimate surface reflectance properties, we asked subjects to match the appearance of isolated spheres taken out of their original contexts. We found that subjects were able to perform the task accurately and reliably without contextual information to specify the illumination. The spheres were rendered under a variety of artificial illuminations, such as a single point light source, and a number of photographically-captured real-world illuminations from both indoor and outdoor scenes. Subjects performed more accurately for stimuli viewed under real-world patterns of illumination than under artificial illuminations, suggesting that subjects use stored assumptions about the regularities of real-world illuminations to solve the ill-posed problem.
Abstract:
Log-polar image architectures, motivated by the structure of the human visual field, have long been investigated in computer vision for estimating motion parameters from an optical flow vector field. Practical problems with this approach have been: (i) dependence on an assumed alignment of the visual and motion axes; (ii) sensitivity to occlusion from moving and stationary objects in the central visual field, where much of the numerical sensitivity is concentrated; and (iii) inaccuracy of the log-polar architecture (which is an approximation to the central 20°) for wide-field biological vision. In the present paper, we show that an algorithm based on a generalization of the log-polar architecture, termed the log-dipolar sensor, provides a large improvement in performance relative to the usual log-polar sampling. Specifically, our algorithm: (i) is tolerant of large misalignment of the optical and motion axes; (ii) is insensitive to significant occlusion by objects of unknown motion; and (iii) represents a more correct analogy to the wide-field structure of human vision. Using the Helmholtz-Hodge decomposition to estimate the optical flow vector field on a log-dipolar sensor, we demonstrate these advantages using synthetic optical flow maps as well as natural image sequences.
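The log-dipolar sensor is the paper's contribution and its details are not reproduced here. As background, the ordinary log-polar sampling it generalizes can be sketched as a resampling of the image onto rings whose radii grow geometrically, mimicking the eccentricity-dependent sampling of the retina. All names and parameter values below are illustrative assumptions.

```python
import numpy as np

def log_polar_sample(image, n_rings=32, n_wedges=64, r_min=2.0):
    """Resample a grayscale image onto a log-polar grid centred on the
    image midpoint: ring radii grow geometrically from r_min to the
    largest radius that fits, so resolution falls off with eccentricity."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    # Geometric progression of ring radii.
    radii = r_min * (r_max / r_min) ** (np.arange(n_rings) / (n_rings - 1))
    angles = 2 * np.pi * np.arange(n_wedges) / n_wedges
    rr = radii[:, None] * np.sin(angles)[None, :]
    cc = radii[:, None] * np.cos(angles)[None, :]
    ys = np.clip(np.round(cy + rr).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + cc).astype(int), 0, w - 1)
    return image[ys, xs]  # shape (n_rings, n_wedges)

img = np.random.rand(128, 128)
lp = log_polar_sample(img)
print(lp.shape)  # (32, 64)
```

In this representation a rotation about the centre becomes a shift along the wedge axis and a scaling becomes a shift along the ring axis, which is what makes the architecture attractive for motion estimation.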
Abstract:
Apparent reversals in rotating trapezia have been regarded as evidence that human vision favours methods which are heuristic or form dependent. However, the argument is based on the assumption that general algorithmic methods would avoid the illusion, and that has never been clear. A general algorithm for interpreting moving parallels has been developed to address the issue. It handles a considerable range of stimuli successfully, but finds multiple interpretations in situations which correspond closely to those where apparent reversals occur. This strengthens the hypothesis that apparent reversals may occur when general algorithmic methods fail and heuristics are invoked as a stopgap.
Abstract:
The interpretations people attach to line drawings reflect shape-related processes in human vision. Their divergences from expectations embodied in related machine vision traditions are summarized, and used to suggest how human vision decomposes the task of interpretation. A model called IO implements this idea. It first identifies geometrically regular, local fragments. Initial decisions fix edge orientations, and this information constrains decisions about other properties. Relations between fragments are explored, beginning with weak consistency checks and moving to fuller ones. IO's output captures multiple distinctive characteristics of human performance, and it suggests steady progress towards understanding shape-related visual processes is possible.
Abstract:
The aim of this study was to compare the visual processing of contrast in concentric sinusoidal grating stimuli between adolescents and adults. The study included 20 volunteers divided into two groups: 10 adolescents aged 13-19 years (M=16.5, SD=1.65) and 10 adults aged 20-26 years (M=21.8, SD=2.04). Contrast sensitivity at spatial frequencies of 0.6, 2.5, 5, and 20 cycles per degree of visual angle (cpd) was measured with the two-alternative forced-choice (2AFC) psychophysical method. A one-way ANOVA showed a significant difference between groups: F(4, 237)=3.74, p<.05. A post-hoc Tukey HSD test showed significant differences at the frequencies of 0.6 cpd (p<.05) and 20 cpd (p<.05). Thus, the results show that visual perception differs between adolescents and adults with regard to the sensory mechanisms that process contrast. These results are useful for better characterizing and comprehending human vision development.
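The 2AFC threshold procedure mentioned above can be sketched with a simulated observer. This is a generic 1-up/2-down staircase, not necessarily the exact procedure used in the study; all names and parameter values are assumptions.

```python
import numpy as np

def run_staircase(threshold, n_trials=400, start=0.5, step=0.02, seed=0):
    """Simulate a 1-up/2-down staircase for a 2AFC contrast task.
    Contrast is lowered after two consecutive correct responses and
    raised after each error, so it converges near the 70.7%-correct
    point of the observer's psychometric function."""
    rng = np.random.default_rng(seed)
    c, streak, track = start, 0, []
    for _ in range(n_trials):
        # Simulated observer: logistic psychometric function with a
        # 50% guessing floor (2AFC chance level).
        p_correct = 0.5 + 0.5 / (1.0 + np.exp(-(c - threshold) / 0.03))
        if rng.random() < p_correct:       # correct response
            streak += 1
            if streak == 2:                # two correct -> harder stimulus
                c, streak = max(c - step, 0.0), 0
        else:                              # error -> easier stimulus
            c, streak = c + step, 0
        track.append(c)
    return float(np.mean(track[n_trials // 2:]))  # estimate from last half

est = run_staircase(threshold=0.2)
print(est)  # settles near the simulated observer's threshold region
```

Sensitivity is then taken as the reciprocal of the estimated contrast threshold at each spatial frequency.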
Abstract:
The human ability to perceive depth is something of a puzzle. We perceive three-dimensional spatial information quickly and efficiently by using the binocular stereopsis of our eyes and, more importantly, the knowledge of common objects that we acquire through experience. Modelling the full behaviour of the brain remains out of reach, which is why the larger problem of 3D perception and interpretation is split into a sequence of easier problems. A great deal of research in robot vision aims to obtain 3D information about the surrounding scene. Most of this research models human stereopsis by using two cameras as if they were two eyes. This method, known as stereo vision, has been widely studied in the past, is being studied at present, and will surely see much further work in the future, which allows us to say that it is one of the most interesting topics in computer vision. The stereo vision principle is based on obtaining the three-dimensional position of an object point from the positions of its projections in the two camera image planes. However, before 3D information can be inferred, the mathematical models of both cameras have to be known. This step, known as camera calibration, is described in detail in the thesis. Perhaps the most important problem in stereo vision is the determination of pairs of homologous points in the two images, known as the correspondence problem; it is also one of the most difficult problems to solve and is currently investigated by many researchers. Epipolar geometry allows us to reduce the correspondence problem, and an approach to it is described in the thesis. Nevertheless, it does not solve the problem entirely, as many other considerations have to be taken into account; for example, some points have no correspondence, owing to surface occlusion or simply to projection outside the camera's field of view.
The thesis focuses on structured light, one of the techniques most frequently used to reduce the problems associated with stereo vision. Structured light is based on the relationship between a projected light pattern and an image sensor: the deformations between the pattern projected onto the scene and the one captured by the camera allow three-dimensional information about the illuminated scene to be obtained. The technique has been widely used in applications such as 3D object reconstruction, robot navigation, and quality control. Although the projection of regular patterns solves the problem of points without a match, it does not solve the problem of multiple matching, which forces the use of expensive algorithms to search for the correct matches. In recent years another structured light technique has grown in importance, based on codifying the light projected onto the scene so that each match is unique. Each token of light is imaged by the camera, and its label must be read (the pattern decoded) in order to solve the correspondence problem. The advantages and disadvantages of stereo vision versus structured light, together with a survey of coded structured light, are presented and discussed. The work carried out within this thesis has led to a new coded structured light pattern that solves the correspondence problem uniquely and robustly: uniquely, because each token of light is coded by a different word, which removes the problem of multiple matching; robustly, because the pattern is coded using the position of each token of light with respect to both coordinate axes. Algorithms and experimental results are included in the thesis, with examples of 3D measurement of static objects and of the more complicated measurement of moving objects.
The technique can be used in both cases, as the pattern is coded in a single projection shot, so it is applicable to several robot vision tasks. Our interest is focused on the mathematical study of the camera and pattern projector models, on how these models can be obtained by calibration, and on how they can be used to obtain three-dimensional information from two corresponding points. Furthermore, we have studied structured light and coded structured light, and we have presented a new coded structured light pattern. However, this thesis starts from the assumption that the corresponding points can be well segmented from the captured image. Computer vision constitutes a huge problem, and much work is being done at all levels of human vision modelling, starting from (a) image acquisition; (b) image enhancement, filtering, and processing; and (c) image segmentation, which involves thresholding, thinning, contour detection, texture and colour analysis, and so on. The interest of this thesis begins at the next step, usually known as depth perception or 3D measurement.
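The core geometric step the thesis describes, recovering a 3D point from two corresponding image points once both cameras are calibrated, can be sketched with standard linear (DLT) triangulation. This is textbook stereo geometry rather than the thesis's specific algorithm, and the camera parameters below are made up for illustration.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: given 3x4 projection matrices P1, P2
    and corresponding pixel points x1, x2 = (u, v), recover the 3D point
    that projects onto both. Each view contributes two rows of a
    homogeneous system A X = 0; the solution is the right singular
    vector of A with the smallest singular value."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

def project(P, X):
    """Project a 3D point through a 3x4 camera matrix."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two hypothetical calibrated cameras: identity pose, and one
# translated along the x axis (baseline 0.2).
K = np.diag([800.0, 800.0, 1.0])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])
X_true = np.array([0.3, -0.1, 2.0])
X_hat = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(X_hat)  # recovers approximately [0.3, -0.1, 2.0]
```

In a coded structured light setup, one of the two cameras is replaced by the calibrated pattern projector, and the decoded token label supplies the correspondence directly.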
Abstract:
The thesis centres on Computer Vision and, more specifically, on image segmentation, one of the basic stages of image analysis: the division of an image into a set of visually distinct and uniform regions with respect to intensity, colour, or texture. We propose a strategy based on the complementary use of region and boundary information during the segmentation process, an integration that alleviates some of the basic problems of traditional segmentation. The boundary information is first used to identify the number of regions present in the image and to place a seed inside each one, in order to model the characteristics of the regions statistically and thus define the region information. This information, together with the boundary information, is used to define an energy function that expresses the properties required of the desired segmentation: uniformity inside the regions and contrast with neighbouring regions at their borders. A set of active regions then begins to grow, competing for the pixels of the image, in order to optimise the energy function or, in other words, to find the segmentation that best fits the requirements expressed in that function. Finally, the whole process is embedded in a pyramidal structure, which allows the segmentation result to be refined progressively and its computational cost to be improved. The strategy has been extended to the texture segmentation problem, which entails some basic considerations, such as modelling the regions from a set of texture features and extracting the boundary information when texture is present in the image.
Finally, the approach has been extended to image segmentation that takes both colour and texture properties into account. To this end, the joint use of non-parametric density estimation techniques to describe colour, and of texture features based on the co-occurrence matrix, is proposed to model the image regions adequately and completely. The proposal has been evaluated objectively and compared with several integration techniques on synthetic images, and experiments with real images have also been included, with very positive results.
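The region-competition idea above, seeded regions growing to optimise a uniformity energy, can be sketched in a much simplified form. This toy version models only the region term (a Gaussian intensity model per seed, with no boundary term and no pyramid), and every name and parameter is an assumption.

```python
import numpy as np

def compete_for_pixels(image, seeds, n_iters=5):
    """Toy seeded region competition: each region keeps a Gaussian model
    (mean, variance) of its intensities; every pixel is assigned to the
    region with the lowest negative log-likelihood, and the models are
    re-estimated -- a crude stand-in for minimising a region-uniformity
    energy."""
    means = np.array([image[r, c] for r, c in seeds], dtype=float)
    variances = np.ones(len(seeds))
    labels = np.zeros(image.shape, dtype=int)
    for _ in range(n_iters):
        # Negative log-likelihood of each pixel under each region model.
        cost = ((image[None, :, :] - means[:, None, None]) ** 2
                / (2 * variances[:, None, None])
                + 0.5 * np.log(variances[:, None, None]))
        labels = np.argmin(cost, axis=0)
        for k in range(len(seeds)):
            pix = image[labels == k]
            if pix.size:
                means[k] = pix.mean()
                variances[k] = max(pix.var(), 1e-4)
    return labels

# Two-region test image: dark left half, bright right half, plus noise.
rng = np.random.default_rng(0)
img = np.hstack([np.full((32, 16), 0.2), np.full((32, 16), 0.8)])
img += rng.normal(0, 0.05, img.shape)
labels = compete_for_pixels(img, seeds=[(16, 4), (16, 28)])
```

The full method additionally penalises weak boundary contrast at region borders and runs the competition coarse-to-fine on a pyramid, which this sketch omits.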
Abstract:
In this paper we report on the reliability of image sequences taken by off-the-shelf TV cameras for modelling camera rotation and reconstructing 3D structure using computer vision techniques, despite the fact that such imaging devices are designed for human viewing rather than for machine vision. Our scenario consists of a static scene and a mobile camera moving through it. The scene is any long axial building dominated by features along the three principal orientations, with at least one wall containing prominent repetitive planar features such as doors, windows, or bricks. The camera is an ordinary commercial camcorder moving along the axial axis of the scene and is allowed to rotate freely within +/- 10 degrees in all directions. This makes it possible for the camera to be held by a walking, non-professional cameraman with a normal gait, or to be mounted on a mobile robot. The system has been tested successfully on sequences of images of a variety of structured but fairly cluttered scenes taken by different walking cameramen. Potential application areas of the system include medicine, robotics, and photogrammetry.
Abstract:
Aggressiveness in childhood has been a recurrent complaint among parents and educators, a worrying scenario insofar as the child and his or her family are still predominantly identified as the parties chiefly responsible. This study first presents an understanding of childhood aggressiveness from the standpoint of the Person-Centred Approach, together with that framework's proposal for education, articulated with some principles of complexity theory. Considering that the child labelled as aggressive is constituted through a process of subjectivation in which the socially significant people around the child are implicated, this phenomenological investigation aimed to examine how this labelling is configured, by analysing the statements of the participants: the child identified as aggressive, a classmate, the mother, and the teacher. The study was carried out at a school selected from the mapping produced by the Observatório de Violência nas Escolas Núcleo-Pa. The results point to: a linear view of subjectivity underlying the ways the participants relate to one another; teacher detachment used as a resource to avoid conflict; manifest aggressiveness revealing the pupil's past history and current experience; a relation between the child's reaction to the label and the family history; and the repercussions that the way the child is regarded at school has on his or her learning process. The feelings experienced by the participants draw attention to the implication of everyone involved, reaffirming the need to seek paths that bring about changes in the way the pupil, and the school itself, are seen. Such changes must be grounded in a view of human subjectivity as interactive and complex, one that makes it possible to understand aggressiveness within an intersubjective scenario that may reveal multiple meanings.
Abstract:
The visual evoked potential (VEP) is a cortical response recordable at the scalp surface that reflects the activity of V1 neurons. According to the temporal frequency of stimulation, it is classified as transient or steady-state. Other stimulus properties appear to drive selective activity in the various groups of neurons found in V1. The VEP has therefore been used to study achromatic and chromatic human vision. Several studies have used the VEP to estimate luminance contrast sensitivity in the spatial frequency domain, and more recent studies have employed it to measure colour discrimination thresholds. The transient VEP can complement psychophysical measurements of spatial luminance contrast sensitivity and of chromatic discrimination, and it constitutes a non-invasive method for studying vision in individuals who have difficulty performing psychophysical tests.