982 results for visual object categorization
Abstract:
The existence of hand-centred visual processing has long been established in the macaque premotor cortex. These hand-centred mechanisms have been thought to play a general role in the sensory guidance of movements towards objects or, more recently, in the sensory guidance of object-avoidance movements. We suggest that these hand-centred mechanisms play a specific and prominent role in the rapid selection and control of manual actions following sudden changes in the properties of the objects relevant for hand-object interactions. We discuss recent anatomical and physiological evidence from human and non-human primates which indicates the existence of rapid processing of visual information for hand-object interactions. This new evidence demonstrates how several stages of the hierarchical visual processing system may be bypassed, feeding the motor system with hand-related visual inputs within just 70 ms of a sudden event. This time window is early enough, and this processing rapid enough, to allow the generation and control of rapid hand-centred avoidance and acquisitive actions for aversive and desired objects, respectively.
Abstract:
Perception and action are tightly linked: objects may be perceived not only in terms of visual features but also in terms of possibilities for action. Previous studies showed that when a centrally located object has a salient graspable feature (e.g., a handle), it facilitates motor responses corresponding to the feature's position. However, such so-called affordance effects have been criticized as resulting from spatial compatibility effects, due to the visual asymmetry created by the graspable feature, irrespective of any affordances. To dissociate affordance from spatial compatibility effects, we asked participants to perform a simple reaction-time task on typically graspable and non-graspable objects with similar visual features (e.g., a lollipop and a stop sign). Responses were measured using either electromyography (EMG) on proximal arm muscles during reaching-like movements, or finger key-presses. In both EMG and button-press measurements, participants responded faster when the object was either presented in the same location as the responding hand or was graspable, resulting in significant and independent spatial compatibility and affordance effects, with no interaction. Furthermore, while the spatial compatibility effect was present from the earliest stages of movement preparation and throughout the different stages of movement execution, the affordance effect was restricted to the early stages of movement execution. Finally, we tested a small group of unilateral arm amputees using EMG and found a residual spatial compatibility effect but no affordance effect, suggesting that spatial compatibility effects do not necessarily rely on an individual's available affordances. Our results show a dissociation between affordance and spatial compatibility effects and suggest that, rather than evoking the specific motor action most suitable for interaction with the viewed object, graspable objects prompt the motor system in a general, body-part-independent fashion.
Abstract:
This work presents a method of information fusion involving data captured by both a standard charge-coupled device (CCD) camera and a time-of-flight (ToF) camera, to be used in detecting the proximity between a manipulator robot and a human. Both cameras are assumed to be located above the work area of an industrial robot. The fusion of colour images and time-of-flight information makes it possible to know the 3D localization of objects with respect to a world coordinate system, while also providing their colour information. Considering that the ToF information given by the range camera contains inaccuracies, including distance error, border error, and pixel saturation, some corrections to the ToF information are proposed and developed to improve the results. The proposed fusion method uses the calibration parameters of both cameras to reproject 3D ToF points, expressed in a common coordinate system for both cameras and a robot arm, into 2D colour images. In addition, using the 3D information, motion detection in an industrial robot environment is achieved, and the fusion of information is applied to the previously detected foreground objects. This combination of information results in a matrix that links colour and 3D information, making it possible to characterise an object by its colour in addition to its 3D localization. Further development of these methods will make it possible to identify objects and their position in the real world, and to use this information to prevent possible collisions between the robot and such objects.
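The core reprojection step this abstract describes, mapping calibrated 3D ToF points into the 2D colour image, can be sketched as follows. This is a minimal illustration assuming a standard pinhole model via OpenCV; the function name and all parameter values are hypothetical, not taken from the paper.

```python
import numpy as np
import cv2

def reproject_tof_to_colour(points_3d, rvec, tvec, K, dist_coeffs):
    """Project 3D ToF points (common world frame) into the colour image.

    points_3d   : (N, 3) array of 3D points from the ToF camera,
                  already expressed in the shared world coordinate system.
    rvec, tvec  : extrinsics of the colour camera (world -> camera).
    K           : 3x3 intrinsic matrix of the colour camera.
    dist_coeffs : lens distortion coefficients from calibration.
    """
    pixels, _ = cv2.projectPoints(points_3d.astype(np.float64),
                                  rvec, tvec, K, dist_coeffs)
    return pixels.reshape(-1, 2)  # (N, 2) pixel coordinates

# Sampling the colour image at the returned pixel coordinates then yields
# the matrix linking colour and 3D information that the abstract mentions.
```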
Abstract:
Human observers exhibit large, systematic, distance-dependent biases when estimating the three-dimensional (3D) shape of objects defined by binocular image disparities. This has led some to question the utility of disparity as a cue to 3D shape and whether accurate estimation of 3D shape is at all possible. Others have argued that accurate perception is possible, but only with large continuous perspective transformations of an object. Using a stimulus that is known to elicit large distance-dependent perceptual bias (random-dot stereograms of elliptical cylinders), we show that, contrary to these findings, simply adopting a more naturalistic viewing angle completely eliminates this bias. Using behavioural psychophysics coupled with a novel surface-based reverse correlation methodology, we show that it is binocular edge and contour information that allows for accurate and precise perception, and that observers actively exploit and sample this information when it is available.
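Reverse correlation in its generic form is a simple response-weighted averaging of stimulus noise; a minimal sketch follows. The abstract does not detail the authors' surface-based variant, so this is only the textbook version, with hypothetical names:

```python
import numpy as np

def classification_image(noise_fields, responses):
    """Generic reverse correlation: average the noise shown on each trial,
    split by the observer's binary response, and take the difference.

    noise_fields : (n_trials, H, W) array of per-trial stimulus noise.
    responses    : (n_trials,) boolean array of observer responses.
    """
    yes_mean = noise_fields[responses].mean(axis=0)
    no_mean = noise_fields[~responses].mean(axis=0)
    return yes_mean - no_mean  # stimulus features driving "yes" responses
```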
Abstract:
There is evidence that automatic visual attention favors the right side. This study investigated whether this lateral asymmetry interacts with the right hemisphere dominance for visual location processing and the left hemisphere dominance for visual shape processing. Volunteers were tested in a location discrimination task and a shape discrimination task. The target stimuli (S2) could occur in the left or right hemifield. They were preceded by an ipsilateral, contralateral or bilateral prime stimulus (S1). The attentional effect produced by the right S1 was larger than that produced by the left S1. This lateral asymmetry was similar between the two tasks, suggesting that the hemispheric asymmetries of visual mechanisms do not contribute to it. The finding that it was mainly due to a longer reaction time to the left S2 than to the right S2 in the contralateral S1 condition suggests that the inhibitory component of attention is laterally asymmetric.
Abstract:
Object selection refers to the mechanism of extracting objects of interest while ignoring other objects and the background in a given visual scene. It is a fundamental issue for many computer vision and image analysis techniques, and it is still a challenging task for artificial visual systems. Chaotic phase synchronization takes place between almost identical dynamical systems and means that the phase difference between the systems remains bounded over time, while their amplitudes remain chaotic and may be uncorrelated. Rather than complete synchronization, phase synchronization is believed to be a mechanism for neural integration in the brain. In this paper, an object selection model is proposed. Oscillators in the network representing the salient object in a given scene are phase-synchronized, while no phase synchronization occurs for background objects. In this way, the salient object can be extracted. A shift mechanism is also introduced to move attention from one object to another. Computer simulations show that the model produces results similar to those observed in natural vision systems.
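The phase-locking criterion the abstract uses for selection can be illustrated with a coupled phase-oscillator update. Note that the paper's model is built on chaotic oscillators; this Kuramoto-style sketch (all names and constants hypothetical) only shows how object-wise coupling produces bounded phase differences:

```python
import numpy as np

def phase_step(theta, omega, coupling, dt=0.01):
    """One Euler step of a network of coupled phase oscillators.

    theta    : (N,) current phases, one oscillator per image element.
    omega    : (N,) natural frequencies.
    coupling : (N, N) weights, strong between elements of the same
               (salient) object and weak or zero elsewhere.
    """
    # phase_diff[i, j] = theta[j] - theta[i]
    phase_diff = theta[None, :] - theta[:, None]
    dtheta = omega + (coupling * np.sin(phase_diff)).sum(axis=1)
    return theta + dt * dtheta

# Strongly coupled oscillators phase-lock (bounded phase differences),
# while background oscillators drift; the locked group marks the
# selected object, as in the abstract.
```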
Abstract:
Biological systems readily capture the salient object(s) in a given scene, but this is still a difficult task for artificial vision systems. In this paper, a visual selection mechanism based on an integrate-and-fire neural network is proposed. The model not only can discriminate objects in a given visual scene, but can also shift the focus of attention to the salient object. Moreover, it processes a combination of relevant features of an input scene, such as intensity, color, orientation, and their contrast. In comparison to other visual selection approaches, this model presents several interesting features. It is able to direct attention to objects with complex forms, including those that are not linearly separable. Moreover, computer simulations show that the model produces results similar to those observed in natural vision systems.
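For orientation, the leaky integrate-and-fire unit such networks are built from can be sketched as below; the time constants and thresholds are arbitrary illustrative values, and the paper's full model (combining intensity, color, orientation and contrast maps) is not reproduced here:

```python
import numpy as np

def lif_step(v, input_current, v_thresh=1.0, v_reset=0.0, tau=10.0, dt=1.0):
    """One step of a leaky integrate-and-fire unit (vectorised over units).

    Units driven by strong feature input (high saliency) reach threshold
    first; their spikes can then be used to gate the focus of attention.
    """
    v = v + dt * (-v / tau + input_current)   # leaky integration
    spikes = v >= v_thresh                    # threshold crossing
    v = np.where(spikes, v_reset, v)          # reset spiking units
    return v, spikes
```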
Abstract:
The issue of how children learn the meaning of words is fundamental to developmental psychology. Recent attempts to develop or evolve efficient communication protocols among interacting robots or virtual agents have brought that issue to a central place in more applied research fields as well, such as computational linguistics and neural networks. An attractive approach to learning an object-word mapping is so-called cross-situational learning, a scenario based on the intuitive notion that a learner can determine the meaning of a word by finding something in common across all observed uses of that word. Here we show how the deterministic Neural Modeling Fields (NMF) categorization mechanism can be used by the learner as an efficient algorithm to infer the correct object-word mapping. To achieve this, we first reduce the original online learning problem to a batch learning problem in which the inputs to the NMF mechanism are all possible object-word associations that could be inferred from the cross-situational learning scenario. Since many of those associations are incorrect, they are treated as clutter or noise and discarded automatically by a clutter-detector model included in our NMF implementation. With these two key ingredients - batch learning and clutter detection - the NMF mechanism was able to infer the correct object-word mapping perfectly.
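The batch construction of candidate associations the abstract describes can be sketched with simple co-occurrence counting; the NMF machinery and clutter detector themselves are not reproduced here, and the example vocabulary is invented:

```python
from collections import Counter
from itertools import product

def candidate_associations(episodes):
    """Collect every object-word pairing consistent with the observed
    episodes. Correct pairs co-occur in all episodes; the rest is the
    'clutter' that the paper's NMF clutter detector discards.

    episodes : iterable of (objects, words) tuples, each a set.
    """
    counts = Counter()
    for objects, words in episodes:
        counts.update(product(objects, words))
    return counts

# Toy usage: ('ball', 'bola') co-occurs in both episodes and dominates.
episodes = [({'ball', 'dog'}, {'bola', 'cao'}),
            ({'ball', 'cup'}, {'bola', 'copo'})]
print(candidate_associations(episodes).most_common(3))
```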
Abstract:
The development of software artifacts is an engineering process and, like every engineering process, it involves a series of stages that must be conducted through an appropriate methodology. For a given piece of software to achieve its goals, its conceptual and architectural characteristics must be well defined before implementation. Hyperdocument-based applications have a specific characteristic: the definition of their navigational aspects. Navigation is a critical stage in the specification of hyperdocument-based software, since it guides the user during a visit to a site's content. A failure in the navigation specification process causes a loss of context, disorienting the user in the application space. Several methodologies exist for handling the navigational characteristics of hyperdocument-based applications. The main methodologies found in the literature were studied and analysed in this work, and a comparative analysis was carried out, outlining their approaches and stages. The study of hyperdocument specification approaches was a preliminary step serving as the basis for the goal of this work: the construction of a graphical tool for the conceptual specification of hyperdocuments, following a hyperdocument-based software modelling methodology. The method adopted was OOHDM (Object-Oriented Hypermedia Design Model), because it covers all stages of an application development process, with particular attention to navigation. The tool implements a graphical interface in which the user can model the application by creating models. The specification process comprises three models: conceptual modelling, navigational modelling, and interface modelling. The application's characteristics are defined in an incremental process that begins with the conceptual definition and ends with the interface characteristics. The tool generates a prototype of the application in XML. To render the pages in a Web browser, XSLT is used to convert the information from XML to HTML. The models created in the abstract specification stages of the application are exported in OOHDM-ML. A case study was implemented to validate the tool. The main contributions of this work are the construction of a graphical environment for the abstract specification of hyperdocuments and an environment for prototype implementation and model export. The aim is thereby to guide, conduct, and discipline the user's work during the application specification process.
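The XML-to-HTML conversion step described at the end of the abstract is a standard XSLT transform; a minimal sketch in Python with lxml follows. The file names are hypothetical, and the thesis' actual stylesheets are not shown here:

```python
from lxml import etree

# Hypothetical files: the tool's exported XML prototype and a stylesheet.
xml_doc = etree.parse("prototype.xml")
xslt_doc = etree.parse("prototype_to_html.xsl")

transform = etree.XSLT(xslt_doc)   # compile the stylesheet
html_doc = transform(xml_doc)      # apply it to the prototype

with open("page.html", "wb") as f:
    f.write(etree.tostring(html_doc, pretty_print=True))
```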
Abstract:
This study investigates how a blind person learns school knowledge mediated by images in the context of inclusive education, and how that learning can be (or is) supported by adapting images for tactile apprehension and the correlative reading process. To this end, we chose a qualitative research approach and opted for a case study, conducted in a public school in the city of Cruzeta, RN, with a congenitally blind female high-school student enrolled there as the main subject, focusing mostly on the discipline of geography and its use of maps. Our data-construction procedures involved documentary analysis, open reflective interviews, and observation. The theory guiding our assessments lies in the current understanding of human psychological development and its educational process within an inclusive perspective, in contemporary conceptions of visual disability, and in the notion of the image as a cultural product. Accordingly, the human person is a concrete subject whose development is deeply marked by culture, historically built by human society. This subject, regardless of specific features, grasps the world in an interactive and immediate way, internalising and producing culture. On this view, the blind person perceives the stimuli of the environment through multiple senses and acts in the world toward integration into the social environment. The image, as a product of culture, historically and socially determined, appears as a conventional sign that concentrates knowledge from which a student who cannot visually perceive herself and her surroundings must not be excluded. In this direction, an inclusive educational process must build conditions of access to knowledge for all students without distinction, including access to the interpretation of images originally intended for strictly visual apprehension through other perceptual modes. Based on this theory and adopting principles of content analysis, we interpreted the data constructed from the analysis of documents, the subject's speech, records of classroom observation, and other field-diary notes. The search for school-content images adapted for the blind student's tactile apprehension revealed little and unsystematic practice in the school's teaching. It showed us a student life marked by a succession of supports, most of the time inappropriate and instrumental in stifling the construction of her autonomy. It also showed the tensions and contradictions of a supposedly inclusive school environment that stumbles, in pursuit of its intent, over persistent attitudinal and accumulated barriers. These findings arose from cross-analysing the data around a categorisation covering 1) concepts regarding school inclusion, 2) elements of school organisation, educational proposal and teaching practice, 3) the meaning of the visual image as an object of knowledge, 4) perception through multiple senses, and 5) the development and learning of the blind person in the face of the impositions of the social environment. In light of these findings, we infer that the disabled person must be guaranteed the removal of the attitudinal barriers that work against her full development and the construction of her autonomy.
In that sense, the student with visual disability, like all students, should be given not only access to school but also an effective school life, meaning the apprehension of knowledge in all its modalities, including imagery. To that end, there is a need for continued teacher training, the construction of a support network responding to all students' needs, and opportunities to develop reading skills beyond an eminently sight-centred perspective.
Abstract:
Visual attention is a very important task in autonomous robotics but, because of its complexity, the required processing time is significant. We propose an architecture for feature selection using foveated images that is guided by visual attention tasks and reduces the processing time required to perform them. Our system can be applied in bottom-up or top-down visual attention. The foveated model determines which scales are to be used in the feature extraction algorithm, allowing the system to discard features that are not strictly necessary for the tasks and thus reducing the processing time. If the fovea is correctly placed, the processing time can be reduced without compromising the quality of the tasks' outputs. The distance of the fovea from the object is also analysed. If the visual system loses tracking in top-down attention, basic fovea-placement strategies can be applied. Experiments have shown that this approach can reduce the processing time by up to 60%. To validate the method, we tested it with the feature extraction algorithm known as Speeded Up Robust Features (SURF), one of the most efficient approaches to feature extraction. With the proposed architecture, we can meet the real-time requirements of robot vision, mainly for application in autonomous robotics.
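The masking-plus-scale-limiting idea can be illustrated with OpenCV's SURF implementation. This is not the paper's multiresolution foveated model, only a rough sketch of restricting extraction to a foveal region and fewer octaves; SURF lives in opencv-contrib, and all parameter values here are arbitrary:

```python
import cv2
import numpy as np

def foveated_surf(image, fovea_xy, fovea_radius, n_octaves=2):
    """Extract SURF features only inside a circular foveal mask, with a
    reduced number of octaves (scales) to cut processing time.
    """
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    cv2.circle(mask, fovea_xy, fovea_radius, 255, -1)  # foveal window
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400,
                                       nOctaves=n_octaves)
    return surf.detectAndCompute(image, mask)  # keypoints, descriptors
```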
Abstract:
This work uses feature-based computer vision algorithms to identify medicine boxes for the visually impaired. The system is intended for people whose vision impairment hinders the identification of the correct medicine to be taken. We use the camera available in several popular devices, such as computers, televisions and phones, to identify the correct medicine box from the image and to play audio giving the user information about the medication, such as the dosage, indications and contraindications. We employ an object-detection model whose algorithms identify features on the medicine boxes and play the audio when those features are detected. Experiments carried out with 15 people showed that 93% consider the system useful and very helpful in identifying medicines by their boxes. This technology can therefore help many people with visual impairments take the right medicine, at the time previously indicated by their physician.
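A feature-matching pipeline of the general kind described could look like the sketch below; the paper does not specify its detector or thresholds, so ORB and the match-count cut-offs here are stand-ins:

```python
import cv2

def identify_box(frame, reference_descriptors, labels, min_matches=25):
    """Match features of a camera frame against stored medicine-box
    descriptors and return the best-matching label, or None.
    """
    orb = cv2.ORB_create()
    _, frame_desc = orb.detectAndCompute(frame, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    best_label, best_count = None, 0
    for desc, label in zip(reference_descriptors, labels):
        matches = matcher.match(frame_desc, desc)
        good = [m for m in matches if m.distance < 50]  # distance cut-off
        if len(good) > best_count:
            best_label, best_count = label, len(good)
    return best_label if best_count >= min_matches else None

# On a positive identification, the system would then play the audio
# describing dosage, indications and contraindications.
```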
Abstract:
Since the pioneering discoveries of Hubel and Wiesel, a vast literature has accumulated describing the neuronal responses of the primary visual cortex (V1) to different visual stimuli. These stimuli consist mainly of moving bars, dots or gratings, which are useful for probing responses within the classical receptive field (CRF) to basic features of visual stimuli such as orientation, direction of motion and contrast, among others. Over the last two decades, however, it has become increasingly evident that the activity of V1 neurons can be modulated by stimuli outside the CRF. Primary visual areas could therefore be involved in more complex visual functions such as, for example, separating an object or figure from its background (figure-ground segregation), and it is assumed that the long-range intrinsic connections in V1, as well as connections from higher visual areas, are actively involved in this process. Their possible function has been inferred from analysing the variations in responses of individual neurons induced by a stimulus located outside the CRF. Although these connections very probably also have an impact both on the joint activity of the neurons involved in processing the figure and on the field potential, these questions remain little studied. To examine the modulation of visual context on these activities, we recorded action potentials and field potentials in parallel from up to 48 electrodes implanted in the primary visual area of anaesthetised cats. We stimulated with composite gratings and natural scenes, focusing on the activity of neurons whose CRF was located on the figure. Likewise, to examine the influence of lateral connections, the signal coming from the isotopic, contralateral visual area was removed by reversible cooling deactivation. We did this because: i) intrinsic lateral connections cannot easily be manipulated without directly affecting the signals being measured; ii) interhemispheric connections share the main anatomical characteristics of the intrinsic lateral network and can be seen as its functional continuation between the two hemispheres; and iii) cooling deactivates the connections causally and reversibly, temporarily silencing their signal and allowing direct conclusions about their contribution. Our results demonstrate that the figure-ground segmentation mechanism is reflected in the firing rates of individual neurons, as well as in the power of the field potential and in the relationship between its phase and the firing patterns produced by the population. Furthermore, the interhemispheric lateral connections modulate these variables depending on the stimulation outside the CRF. We also observed an influence of this lateral circuit on the coherence between field potentials recorded at distant electrodes. In conclusion, our results support the idea of a complex figure-ground segmentation mechanism operating from the primary visual areas onward at different frequency scales. This mechanism seems to involve groups of synchronously active neurons that depend on the phase of the field potential. Our results are also compatible with the hypothesis that long-range lateral connections are part of this mechanism.
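The coherence measure mentioned near the end is a standard spectral quantity; a minimal sketch of computing it between two LFP channels with SciPy follows (the sampling rate and placeholder signals are assumptions, not the study's data):

```python
import numpy as np
from scipy.signal import coherence

fs = 1000.0                      # assumed sampling rate in Hz
lfp_a = np.random.randn(10_000)  # placeholder signals standing in for
lfp_b = np.random.randn(10_000)  # two distant LFP channels

# Magnitude-squared coherence as a function of frequency; comparing it
# with the interhemispheric input cooled versus intact quantifies that
# circuit's influence.
freqs, coh = coherence(lfp_a, lfp_b, fs=fs, nperseg=1024)
```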
Abstract:
The goals of this study were to examine the influence of visual information on body sway as a function of self- versus object-motion perception and of visual information quality. Participants who were aware (object-motion) or unaware (self-motion) of the movement of a moving room were asked to stand upright at five different distances from its frontal wall. The effect of visual information on body sway decreased when participants were aware of the sensory manipulation. Moreover, while the visual influence on body sway decreased as distance increased under self-motion perception, no such effect was observed in the object-motion mode. The overall results indicate that the functioning of the postural control system can be altered by prior knowledge, and adaptation to changes in sensory quality seems to occur in the self-motion but not the object-motion perception mode.