992 results for "Visão humana" (human vision)
Abstract:
Optics is the study of the interaction of light with physical systems. Human vision is the product of the interaction of light with the eye, a very peculiar physical system. Here we present a basic study of the relationship between optics and human vision, covering:
- The fundamental physical properties that characterize light and colour, and the characteristics of the media in which light propagates.
- The basic laws of geometrical optics, founded on the rectilinear propagation of light in the form of rays, on the independence of light rays, and on the principle of reversibility of light beams. This principle underlies image formation in lenses and mirrors and is applied to the study of image formation in the human eye.
- The laws of refraction and reflection and the types of lenses, which permit the construction of optical instruments for the study of the physical universe and of appliances to correct vision defects.
- The human vision process itself: the reception of light (electromagnetic radiation in the wavelength range visible to us) by the eye and the transmission of the information gathered by the retina to the brain, where it is interpreted. Vision involves a biophysical relation between light and the biological structure of the eye, which comprises the cornea, iris, crystalline lens and retina. We analyse how these parts of the eye function in receiving images and sending their information to the brain.
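For reference, the laws the abstract invokes have standard textbook forms (added here for context, not taken from the paper): the law of reflection, Snell's law of refraction, and the thin-lens equation relating focal length f to object and image distances:

```latex
\theta_r = \theta_i \quad \text{(reflection)}, \qquad
n_1 \sin\theta_1 = n_2 \sin\theta_2 \quad \text{(refraction, Snell's law)}, \qquad
\frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i} \quad \text{(thin lens)}.
```

Corrective eyewear exploits the thin-lens relation: an added lens shifts the image distance so that the eye's own optics can bring it to focus on the retina.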
Abstract:
Colour is a perceptual attribute that allows us to identify and localise environmental patterns of equal brightness, and it constitutes an additional dimension in the identification of objects, beyond the detection of the numerous other attributes of objects in their relation to the visual scene, such as luminance, contrast, shape, movement, texture and depth. Hence its fundamental importance in the activities performed by animals and by human beings in their interaction with the environment. Visual psychophysics is concerned with the quantitative study of the relation between physical events of sensory stimulation and the behavioural response resulting from that stimulation, thereby providing means of evaluating aspects of human vision, such as colour vision. The aim of this article is to present several efficient techniques for assessing human chromatic vision using adaptive psychophysical methods.
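The abstract does not name a specific adaptive procedure, so as an illustration only, here is a minimal transformed (1-up/2-down) staircase, a common adaptive psychophysical method that converges on roughly the 70.7%-correct point. The observer model and all parameters are hypothetical:

```python
import random

def observer_responds(level, true_threshold=0.3):
    """Hypothetical observer: probability of a correct response grows
    with the chromatic difference `level` (arbitrary units)."""
    p_correct = 0.5 + 0.5 * min(1.0, level / (2 * true_threshold))
    return random.random() < p_correct

def staircase(start=1.0, step=0.1, n_reversals=12):
    """1-up/2-down staircase: two consecutive correct responses lower the
    stimulus level, one error raises it; the threshold is estimated as the
    mean of the last few reversal levels."""
    level, direction, reversals, streak = start, -1, [], 0
    while len(reversals) < n_reversals:
        if observer_responds(level):
            streak += 1
            if streak == 2:                  # 2 correct -> decrease level
                streak = 0
                if direction == +1:
                    reversals.append(level)  # direction change = reversal
                direction = -1
                level = max(0.0, level - step)
        else:                                # 1 error -> increase level
            streak = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level += step
    return sum(reversals[-6:]) / 6

print(f"estimated threshold: {staircase():.3f}")
```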
Abstract:
Face detection and recognition should be complemented by recognition of facial expression, for example in social robots, which must react to human emotions. Our framework is based on two multi-scale representations in cortical area V1: keypoints at eyes, nose and mouth are grouped for face detection [1]; lines and edges provide information for face recognition [2].
Abstract:
The primary visual cortex employs simple, complex and end-stopped cells to create a scale space of 1D singularities (lines and edges) and of 2D singularities (line and edge junctions and crossings, called keypoints). In this paper we show first results of a biological model which attributes information about the local image structure to keypoints at all scales, i.e., junction type (L, T, +) and the main line/edge orientations. Keypoint annotation, in combination with coarse-to-fine scale processing, facilitates various processes such as image matching (stereo and optical flow), object segregation and object tracking.
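As a toy illustration of keypoint annotation (not the authors' model), the junction type can be read off from the number of line/edge branches meeting at a keypoint. In this hedged Python sketch, the branch threshold and histogram resolution are invented for the example:

```python
import numpy as np

def junction_type(direction_hist, thresh=0.5):
    """Classify a keypoint's junction type from a 360-degree histogram of
    edge energy around it (one bin per 10 degrees here). Branches are local
    maxima above `thresh` (relative to the strongest branch); this stands in
    for the end-stopped/simple-cell machinery of the abstract."""
    h = np.asarray(direction_hist, dtype=float)
    h = h / (h.max() + 1e-9)
    peaks = [i for i in range(len(h))
             if h[i] >= thresh
             and h[i] >= h[i - 1]              # circular neighbours
             and h[i] >= h[(i + 1) % len(h)]]
    return {2: "L", 3: "T", 4: "+"}.get(len(peaks), "unknown")

# Example: branches along 0, 90 and 180 degrees -> a T junction.
hist = np.zeros(36)
hist[[0, 9, 18]] = 1.0
print(junction_type(hist))   # -> "T"
```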
Abstract:
In this paper we present an improved model for line and edge detection in cortical area V1. This model is based on responses of simple and complex cells, and it is multi-scale with no free parameters. We illustrate the use of the multi-scale line/edge representation in different processes: visual reconstruction or brightness perception, automatic scale selection and object segregation. A two-level object categorization scenario is tested in which pre-categorization is based on coarse scales only and final categorization on coarse plus fine scales. We also present a multi-scale object and face recognition model. Processing schemes are discussed in the framework of a complete cortical architecture. The fact that brightness perception and object recognition may be based on the same symbolic image representation is an indication that the entire (visual) cortex is involved in consciousness.
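The simple/complex-cell machinery behind such models can be caricatured by the standard quadrature energy model: an even and an odd Gabor filter act as simple cells, and their energy acts as a complex cell, responding to both lines and edges. A minimal sketch with illustrative parameters, not the paper's multi-scale, parameter-free scheme:

```python
import numpy as np
from scipy.ndimage import convolve

def gabor(size, wavelength, theta, phase):
    """2D Gabor kernel: an oriented sinusoid under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * (size / 5) ** 2))
    return env * np.cos(2 * np.pi * xr / wavelength + phase)

def complex_cell_response(img, wavelength=8.0, theta=0.0, size=21):
    """Energy model: modulus of a quadrature (even/odd) simple-cell pair.
    The even part peaks on lines, the odd part on edges; the energy
    peaks on both."""
    even = convolve(img, gabor(size, wavelength, theta, 0.0))
    odd = convolve(img, gabor(size, wavelength, theta, np.pi / 2))
    return np.sqrt(even**2 + odd**2)
```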
Abstract:
Most simultaneous localisation and mapping (SLAM) solutions were developed for the navigation of non-cognitive robots. Using a variety of sensors, the distances to walls and other objects are determined and then used to build a map of the environment and to update the robot's position. When developing a cognitive robot such a solution is not appropriate, since it requires accurate sensors and precise odometry, and it lacks fundamental features of cognition such as time and memory. In this paper we present a SLAM solution in which such features are taken into account and integrated. Moreover, this method requires neither precise odometry nor accurate ranging sensors.
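The abstract does not specify how time and memory are integrated; as one hedged way to picture the idea, consider an occupancy grid whose evidence decays toward "unknown" between updates, so stale observations are gradually forgotten:

```python
import numpy as np

class DecayingGrid:
    """Occupancy grid with exponential forgetting: each update first decays
    old evidence toward 'unknown' (0.5), so observations fade with time.
    Purely illustrative; the paper's actual mechanism may differ."""

    def __init__(self, shape, decay=0.98):
        self.p = np.full(shape, 0.5)   # occupancy probabilities
        self.decay = decay

    def update(self, observed_cells, occupied):
        # Forgetting step: pull every cell back toward 0.5.
        self.p = 0.5 + self.decay * (self.p - 0.5)
        # Evidence step: nudge observed cells toward the observation.
        target = 1.0 if occupied else 0.0
        for (i, j) in observed_cells:
            self.p[i, j] += 0.3 * (target - self.p[i, j])
```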
Abstract:
Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extraction. Simple, complex and end-stopped cells provide input for line, edge and keypoint detection. Detected events provide a rich, multi-scale object representation, and this representation can be stored in memory in order to identify objects. In this paper, the above context is applied to face recognition. The multi-scale line/edge representation is explored in conjunction with keypoint-based saliency maps for Focus-of-Attention. Recognition rates of up to 96% were achieved by combining frontal and 3/4 views, and recognition was quite robust against partial occlusions.
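A keypoint-based saliency map for Focus-of-Attention can be sketched very simply: deposit each keypoint's response into an accumulator image and blur it, then let fixations visit the peaks in order of strength. Hypothetical parameters, only an illustration of the idea:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def saliency_map(keypoints, shape, sigma=8.0):
    """Accumulate keypoint activity, given as (x, y, response) tuples,
    into an image of the given shape and smooth it. The peaks of the
    result serve as candidate fixation points."""
    sal = np.zeros(shape)
    for x, y, r in keypoints:
        sal[int(y), int(x)] += r
    return gaussian_filter(sal, sigma)
```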
Abstract:
Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Cortical area V1 contains double-opponent colour blobs, as well as simple, complex and end-stopped cells, which provide input for a multi-scale line/edge representation, keypoints for dynamic routing, and saliency maps for Focus-of-Attention. Combined, these allow us to segregate faces. Events of different facial views are stored in memory and combined in order to identify the view and to recognise the face, including its facial expression. In this paper we show that with five 2D views and their cortical representations it is possible to determine left-right and frontal-lateral-profile views and to achieve view-invariant recognition of 3D faces.
Abstract:
A complete image ontology can be obtained by formalising a top-down meta-language which must address all possibilities, from global message and composition to objects and local surface properties.
Abstract:
In his introduction, Pinna (2010) quoted one of Wertheimer’s observations: “I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of color. Do I have ‘327’? No. I have sky, house, and trees.” This seems quite remarkable, for Max Wertheimer, together with Kurt Koffka and Wolfgang Koehler, was a pioneer of Gestalt Theory: perceptual organisation was tackled considering grouping rules of line and edge elements in relation to figure-ground segregation, i.e., a meaningful object (the figure) as perceived against a complex background (the ground). At the lowest level – line and edge elements – Wertheimer (1923) himself formulated grouping principles on the basis of proximity, good continuation, convexity, symmetry and, often forgotten, past experience of the observer. Rubin (1921) formulated rules for figure-ground segregation using surroundedness, size and orientation, but also convexity and symmetry. Almost a century of Gestalt research later, Pinna and Reeves (2006) introduced the notion of figurality, meant to represent the integrated set of properties of visual objects, from the principles of grouping and figure-ground to the colour and volume of objects with shading. Pinna, in 2010, went one important step further and studied perceptual meaning, i.e., the interpretation of complex figures on the basis of past experience of the observer. Re-establishing a link to Wertheimer’s rule about past experience, he formulated five propositions, three definitions and seven properties on the basis of observations made on graphically manipulated patterns. For example, he introduced the illusion of meaning by comics-like elements suggesting wind, therefore inducing a learned interpretation. His last figure shows a regular array of squares but with irregular positions on the right side. This pile of (ir)regular squares can be interpreted as the result of an earthquake which destroyed part of an apartment block. This is much more intuitive, direct and economical than describing the complexity of the array of squares.
Abstract:
Attention is usually modelled by sequential fixation of peaks in saliency maps. Such maps code local conspicuity: complexity, colour and texture. These features bear no relation to entire objects unless disparity and optical flow are also considered, which often segregate entire objects from their background. Recently we developed a model of local gist vision: which types of objects are roughly where in a scene. This model addresses man-made objects, which are dominated by a small shape repertoire: squares, rectangles, trapeziums, triangles, circles and ellipses. Exploiting only local colour contrast, the model can detect these shapes with a small hierarchy of cell layers devoted to low- and mid-level geometry. The model has been tested successfully on video sequences containing traffic signs and other scenes, and partial occlusions were not problematic.
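The biological cell-layer hierarchy itself is not reproduced here, but the shape repertoire the abstract names can be made concrete with a conventional computer-vision stand-in: polygon approximation of blob contours, classified by vertex count. A sketch assuming OpenCV (version 4 or later) and a hypothetical pre-segmented binary input:

```python
import cv2

def classify_shapes(binary_img):
    """Label blobs as triangle / quadrilateral / ellipse-like by polygon
    approximation. Only an illustration of the shape repertoire, not the
    paper's hierarchy of low- and mid-level geometric cell layers."""
    contours, _ = cv2.findContours(binary_img, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    labels = []
    for c in contours:
        eps = 0.02 * cv2.arcLength(c, True)          # tolerance ~2% of perimeter
        poly = cv2.approxPolyDP(c, eps, True)
        if len(poly) == 3:
            labels.append("triangle")
        elif len(poly) == 4:
            labels.append("square/rectangle/trapezium")
        else:
            labels.append("circle/ellipse")
    return labels
```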
Abstract:
A biological disparity energy model can estimate local depth information by using a population of V1 complex cells. Instead of applying an analytical model which explicitly involves cell parameters like spatial frequency, orientation, binocular phase and position difference, we developed a model which only involves the cells’ responses, such that disparity can be extracted from a population code, using only a set of cells previously trained with random-dot stereograms of uniform disparity. Despite good results in smooth regions, the model needs complementary processing, notably at depth transitions. We therefore introduce a new model to extract disparity at keypoints such as edge junctions, line endings and points with large curvature. Responses of end-stopped cells serve to detect keypoints, and those of simple cells are used to detect the orientations of their underlying line and edge structures. Annotated keypoints are then used in the left-right matching process, with a hierarchical, multi-scale tree structure and a saliency map to segregate disparity. By combining both models we can (re)define depth transitions and regions where the disparity energy model is less accurate.
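For context, the classical disparity energy response of a binocular complex cell (the standard textbook form, not this paper's population code) combines even and odd simple-cell responses from the left and right eyes; the preferred disparity follows from the interocular phase difference Δφ at spatial frequency f:

```latex
E(x) = \bigl(L_e(x) + R_e(x)\bigr)^2 + \bigl(L_o(x) + R_o(x)\bigr)^2,
\qquad d_{\text{pref}} \approx \frac{\Delta\phi}{2\pi f}.
```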
Abstract:
Increasingly more applications in computer vision employ interest points. Algorithms like SIFT and SURF are all based on partial derivatives of images smoothed with Gaussian filter kernels. These algorithms are fast and therefore very popular.
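The Gaussian machinery behind such detectors is compact enough to sketch. Here is a single-octave difference-of-Gaussians detector, the core of SIFT's keypoint localisation, minus octaves, sub-pixel refinement and edge suppression; the thresholds are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_keypoints(img, sigma=1.6, k=1.4142, thresh=0.02):
    """Difference-of-Gaussians interest points: band-pass the image between
    two nearby scales and keep strong local extrema of the response."""
    dog = gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)
    mag = np.abs(dog)
    local_max = maximum_filter(mag, size=3)       # 3x3 neighbourhood maxima
    ys, xs = np.nonzero((mag == local_max) & (mag > thresh))
    return list(zip(xs, ys))
```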
Abstract:
Disparity energy models (DEMs) estimate local depth information on the basis of V1 complex cells. Our recent DEM (Martins et al., 2011, ISSPIT, 261-266) employs a population code. Once the population's cells have been trained with random-dot stereograms, it is applied at all retinotopic positions in the visual field. Despite producing good results in textured regions, the model needs to be made more precise, especially at depth transitions.
Abstract:
Human-robot interaction is an interdisciplinary research area which aims at integrating human factors, cognitive psychology and robot technology. The ultimate goal is the development of social robots. These robots are expected to work in human environments and to understand the behaviour of persons through gestures and body movements. In this paper we present a biological and real-time framework for detecting and tracking hands. This framework is based on keypoints extracted from cortical V1 end-stopped cells. Detected keypoints and the cells’ responses are used to classify the junction type. By combining annotated keypoints in a hierarchical, multi-scale tree structure, moving and deformable hands can be segregated, their movements can be obtained, and they can be tracked over time. By using hand templates with keypoints at only two scales, a hand’s gestures can be recognized.
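Template matching with keypoints can be illustrated with a crude set-to-set distance; this is not the paper's two-scale annotated matching, just a sketch assuming keypoints and templates are given as position-normalised Nx2 arrays:

```python
import numpy as np

def template_distance(kps, template_kps):
    """Mean symmetric nearest-neighbour distance between an observed
    keypoint set and a stored hand template (both Nx2 float arrays)."""
    d = np.linalg.norm(kps[:, None, :] - template_kps[None, :, :], axis=2)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def recognise(kps, templates):
    """Return the name of the best-matching hand template from a dict
    mapping gesture names to keypoint arrays."""
    return min(templates, key=lambda name: template_distance(kps, templates[name]))
```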