887 resultados para Visual Object Recognition


Relevância:

90.00% 90.00%

Publicador:

Resumo:

As we look around a scene, we perceive it as continuous and stable even though each saccadic eye movement changes the visual input to the retinas. How the brain achieves this perceptual stabilization is unknown, but a major hypothesis is that it relies on presaccadic remapping, a process in which neurons shift their visual sensitivity to a new location in the scene just before each saccade. This hypothesis is difficult to test in vivo because complete, selective inactivation of remapping is currently intractable. We tested it in silico with a hierarchical, sheet-based neural network model of the visual and oculomotor system. The model generated saccadic commands to move a video camera abruptly. Visual input from the camera and internal copies of the saccadic movement commands, or corollary discharge, converged at a map-level simulation of the frontal eye field (FEF), a primate brain area known to receive such inputs. FEF output was combined with eye position signals to yield a suitable coordinate frame for guiding arm movements of a robot. Our operational definition of perceptual stability was "useful stability,” quantified as continuously accurate pointing to a visual object despite camera saccades. During training, the emergence of useful stability was correlated tightly with the emergence of presaccadic remapping in the FEF. Remapping depended on corollary discharge but its timing was synchronized to the updating of eye position. When coupled to predictive eye position signals, remapping served to stabilize the target representation for continuously accurate pointing. Graded inactivations of pathways in the model replicated, and helped to interpret, previous in vivo experiments. The results support the hypothesis that visual stability requires presaccadic remapping, provide explanations for the function and timing of remapping, and offer testable hypotheses for in vivo studies. We conclude that remapping allows for seamless coordinate frame transformations and quick actions despite visual afferent lags. With visual remapping in place for behavior, it may be exploited for perceptual continuity.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Tese de dout., Engenharia Electrónica e de Computadores, Faculdade de Ciência e Tecnologia, Universidade do Algarve, 2007

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Ultrasonic, infrared, laser and other sensors are being applied in robotics. Although combinations of these have allowed robots to navigate, they are only suited for specific scenarios, depending on their limitations. Recent advances in computer vision are turning cameras into useful low-cost sensors that can operate in most types of environments. Cameras enable robots to detect obstacles, recognize objects, obtain visual odometry, detect and recognize people and gestures, among other possibilities. In this paper we present a completely biologically inspired vision system for robot navigation. It comprises stereo vision for obstacle detection, and object recognition for landmark-based navigation. We employ a novel keypoint descriptor which codes responses of cortical complex cells. We also present a biologically inspired saliency component, based on disparity and colour.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Neuroimaging studies of aesthetic appreciation have shown that activity in the lateral occipital area (LO)—a key node in the object recognition pathway—is modulated by the extent to which visual artworks are liked or found beautiful. However, the available evidence is only correlational. Here we used transcranial magnetic stimulation (TMS) to investigate the putative causal role of LO in the aesthetic appreciation of paintings. In our first experiment, we found that interfering with LO activity during aesthetic appreciation selectively reduced evaluation of representational paintings, leaving appreciation of abstract paintings unaffected. A second experiment demonstrated that, although the perceived clearness of the images overall positively correlated with liking, the detrimental effect of LO TMS on aesthetic appreciation does not owe to TMS reducing perceived clearness. Taken together, our findings suggest that object-recognition mechanisms mediated by LO play a causal role in aesthetic appreciation of representational art.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Normal visual perception requires differentiating foreground from background objects. Differences in physical attributes sometimes determine this relationship. Often such differences must instead be inferred, as when two objects or their parts have the same luminance. Modal completion refers to such perceptual "filling-in" of object borders that are accompanied by concurrent brightness enhancement, in turn termed illusory contours (ICs). Amodal completion is filling-in without concurrent brightness enhancement. Presently there are controversies regarding whether both completion processes use a common neural mechanism and whether perceptual filling-in is a bottom-up, feedforward process initiating at the lowest levels of the cortical visual pathway or commences at higher-tier regions. We previously examined modal completion (Murray et al., 2002) and provided evidence that the earliest modal IC sensitivity occurs within higher-tier object recognition areas of the lateral occipital complex (LOC). We further proposed that previous observations of IC sensitivity in lower-tier regions likely reflect feedback modulation from the LOC. The present study tested these proposals, examining the commonality between modal and amodal completion mechanisms with high-density electrical mapping, spatiotemporal topographic analyses, and the local autoregressive average distributed linear inverse source estimation. A common initial mechanism for both types of completion processes (140 msec) that manifested as a modulation in response strength within higher-tier visual areas, including the LOC and parietal structures, is demonstrated, whereas differential mechanisms were evident only at a subsequent time period (240 msec), with amodal completion relying on continued strong responses in these structures.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Psychopathy is associated with well-known characteristics such as a lack of empathy and impulsive behaviour, but it has also been associated with impaired recognition of emotional facial expressions. The use of event-related potentials (ERPs) to examine this phenomenon could shed light on the specific time course and neural activation associated with emotion recognition processes as they relate to psychopathic traits. In the current study we examined the PI , N170, and vertex positive potential (VPP) ERP components and behavioural performance with respect to scores on the Self-Report Psychopathy (SRP-III) questionnaire. Thirty undergraduates completed two tasks, the first of which required the recognition and categorization of affective face stimuli under varying presentation conditions. Happy, angry or fearful faces were presented under with attention directed to the mouth, nose or eye region and varied stimulus exposure duration (30, 75, or 150 ms). We found that behavioural performance to be unrelated to psychopathic personality traits in all conditions, but there was a trend for the Nl70 to peak later in response to fearful and happy facial expressions for individuals high in psychopathic traits. However, the amplitude of the VPP was significantly negatively associated with psychopathic traits, but only in response to stimuli presented under a nose-level fixation. Finally, psychopathic traits were found to be associated with longer N170 latencies in response to stimuli presented under the 30 ms exposure duration. In the second task, participants were required to inhibit processing of irrelevant affective and scrambled face distractors while categorizing unrelated word stimuli as living or nonliving. Psychopathic traits were hypothesized to be positively associated with behavioural performance, as it was proposed that individuals high in psychopathic traits would be less likely to automatically attend to task-irrelevant affective distractors, facilitating word categorization. Thus, decreased interference would be reflected in smaller N170 components, indicating less neural activity associated with processing of distractor faces. We found that overall performance decreased in the presence of angry and fearful distractor faces as psychopathic traits increased. In addition, the amplitude of the N170 decreased and the latency increased in response to affective distractor faces for individuals with higher levels of psychopathic traits. Although we failed to find the predicted behavioural deficit in emotion recognition in Task 1 and facilitation effect in Task 2, the findings of increased N170 and VPP latencies in response to emotional faces are consistent wi th the proposition that abnormal emotion recognition processes may in fact be inherent to psychopathy as a continuous personality trait.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This lexical decision study with eye tracking of Japanese two-kanji-character words investigated the order in which a whole two-character word and its morphographic constituents are activated in the course of lexical access, the relative contributions of the left and the right characters in lexical decision, the depth to which semantic radicals are processed, and how nonlinguistic factors affect lexical processes. Mixed-effects regression analyses of response times and subgaze durations (i.e., first-pass fixation time spent on each of the two characters) revealed joint contributions of morphographic units at all levels of the linguistic structure with the magnitude and the direction of the lexical effects modulated by readers’ locus of attention in a left-to-right preferred processing path. During the early time frame, character effects were larger in magnitude and more robust than radical and whole-word effects, regardless of the font size and the type of nonwords. Extending previous radical-based and character-based models, we propose a task/decision-sensitive character-driven processing model with a level-skipping assumption: Connections from the feature level bypass the lower radical level and link up directly to the higher character level.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Les tâches de vision artificielle telles que la reconnaissance d’objets demeurent irrésolues à ce jour. Les algorithmes d’apprentissage tels que les Réseaux de Neurones Artificiels (RNA), représentent une approche prometteuse permettant d’apprendre des caractéristiques utiles pour ces tâches. Ce processus d’optimisation est néanmoins difficile. Les réseaux profonds à base de Machine de Boltzmann Restreintes (RBM) ont récemment été proposés afin de guider l’extraction de représentations intermédiaires, grâce à un algorithme d’apprentissage non-supervisé. Ce mémoire présente, par l’entremise de trois articles, des contributions à ce domaine de recherche. Le premier article traite de la RBM convolutionelle. L’usage de champs réceptifs locaux ainsi que le regroupement d’unités cachées en couches partageant les même paramètres, réduit considérablement le nombre de paramètres à apprendre et engendre des détecteurs de caractéristiques locaux et équivariant aux translations. Ceci mène à des modèles ayant une meilleure vraisemblance, comparativement aux RBMs entraînées sur des segments d’images. Le deuxième article est motivé par des découvertes récentes en neurosciences. Il analyse l’impact d’unités quadratiques sur des tâches de classification visuelles, ainsi que celui d’une nouvelle fonction d’activation. Nous observons que les RNAs à base d’unités quadratiques utilisant la fonction softsign, donnent de meilleures performances de généralisation. Le dernière article quand à lui, offre une vision critique des algorithmes populaires d’entraînement de RBMs. Nous montrons que l’algorithme de Divergence Contrastive (CD) et la CD Persistente ne sont pas robustes : tous deux nécessitent une surface d’énergie relativement plate afin que leur chaîne négative puisse mixer. La PCD à "poids rapides" contourne ce problème en perturbant légèrement le modèle, cependant, ceci génère des échantillons bruités. L’usage de chaînes tempérées dans la phase négative est une façon robuste d’adresser ces problèmes et mène à de meilleurs modèles génératifs.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

La reconnaissance d’objets est une tâche complexe au cours de laquelle le cerveau doit assembler de manière cohérente tous les éléments d’un objet accessible à l’œil afin de le reconnaître. La construction d’une représentation corticale de l’objet se fait selon un processus appelé « bottom-up », impliquant notamment les régions occipitales et temporales. Un mécanisme « top-down » au niveau des régions pariétales et frontales, facilite la reconnaissance en suggérant des identités potentielles de l’objet à reconnaître. Cependant, le mode de fonctionnement de ces mécanismes est peu connu. Plusieurs études ont démontré une activité gamma induite au moment de la perception cohérente de stimuli, lui conférant ainsi un rôle important dans la reconnaissance d’objets. Cependant, ces études ont utilisé des techniques d’enregistrement peu précises ainsi que des stimuli répétitifs. La première étude de cette thèse vise à décrire la dynamique spatio-temporelle de l’activité gamma induite à l’aide de l’électroencéphalographie intracrânienne, une technique qui possède des résolutions spatiales et temporelles des plus précises. Une tâche d’images fragmentées a été conçue dans le but de décrire l’activité gamma induite selon différents niveaux de reconnaissance, tout en évitant la répétition de stimuli déjà reconnus. Afin de mieux circonscrire les mécanismes « top-down », la tâche a été répétée après un délai de 24 heures. Les résultats démontrent une puissante activité gamma induite au moment de la reconnaissance dans les régions « bottom-up ». Quant aux mécanismes « top-down », l’activité était plus importante aux régions occipitopariétales. Après 24 heures, l’activité était davantage puissante aux régions frontales, suggérant une adaptation des procédés « top-down » selon les demandes de la tâche. Très peu d’études se sont intéressées au rythme alpha dans la reconnaissance d’objets, malgré qu’il soit bien reconnu pour son rôle dans l’attention, la mémoire et la communication des régions neuronales distantes. La seconde étude de cette thèse vise donc à décrire plus précisément l’implication du rythme alpha dans la reconnaissance d’objets en utilisant les techniques et tâches identiques à la première étude. Les analyses révèlent une puissante activité alpha se propageant des régions postérieures aux régions antérieures, non spécifique à la reconnaissance. Une synchronisation de la phase de l’alpha était, quant à elle, observable qu’au moment de la reconnaissance. Après 24 heures, un patron similaire était observable, mais l’amplitude de l’activité augmentait au niveau frontal et les synchronies de la phase étaient davantage distribuées. Le rythme alpha semble donc refléter des processus attentionnels et communicationnels dans la reconnaissance d’objets. En conclusion, cette thèse a permis de décrire avec précision la dynamique spatio-temporelle de l’activité gamma induite et du rythme alpha ainsi que d’en apprendre davantage sur les rôles potentiels que ces deux rythmes occupent dans la reconnaissance d’objets.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

La capacité du système visuel humain à compléter une image partiellement dévoilée et à en dériver une forme globale à partir de ses fragments visibles incomplets est un phénomène qui suscite, jusqu’à nos jours, l’intérêt de nombreux scientifiques œuvrant dans différents milieux de recherche tels que l’informatique, l’ingénierie en intelligence artificielle, la perception et les neurosciences. Dans le cadre de la présente thèse, nous nous sommes intéressés spécifiquement sur les substrats neuronaux associés à ce phénomène de clôture perceptive. La thèse actuelle a donc pour objectif général d’explorer le décours spatio-temporel des corrélats neuronaux associés à la clôture perceptive au cours d’une tâche d’identification d’objets. Dans un premier temps, le premier article visera à caractériser la signature électrophysiologique liée à la clôture perceptive chez des personnes à développement typique dans le but de déterminer si les processus de clôture perceptive reflèteraient l’interaction itérative entre les mécanismes de bas et de haut-niveau et si ceux-ci seraient sollicités à une étape précoce ou tardive lors du traitement visuel de l’information. Dans un deuxième temps, le second article a pour objectif d’explorer le décours spatio-temporel des mécanismes neuronaux sous-tendant la clôture perceptive dans le but de déterminer si les processus de clôture perceptive des personnes présentant un trouble autistique se caractérisent par une signature idiosyncrasique des changements d’amplitude des potentiels évoqués (PÉs). En d’autres termes, nous cherchons à déterminer si la clôture perceptive en autisme est atypique et nécessiterait davantage la contribution des mécanismes de bas-niveau et/ou de haut-niveau. Les résultats du premier article indiquent que le phénomène de clôture perceptive est associé temporellement à l’occurrence de la composante de PÉs N80 et P160 tel que révélé par des différences significatives claires entre des objets et des versions méconnaissables brouillées. Nous proposons enfin que la clôture perceptive s’avère un processus de transition reflétant les interactions proactives entre les mécanismes neuronaux œuvrant à apparier l’input sensoriel fragmenté à une représentation d’objets en mémoire plausible. Les résultats du second article révèlent des effets précoces de fragmentation et d’identification obtenus au niveau de composantes de potentiels évoqués N80 et P160 et ce, en toute absence d’effets au niveau des composantes tardives pour les individus avec autisme de haut niveau et avec syndrome d’Asperger. Pour ces deux groupes du trouble du spectre autistique, les données électrophysiologiques suggèrent qu’il n’y aurait pas de pré-activation graduelle de l’activité des régions corticales, entre autres frontales, aux moments précédant et menant vers l’identification d’objets fragmentés. Pour les participants autistes et avec syndrome d’Asperger, les analyses statistiques démontrent d’ailleurs une plus importante activation au niveau des régions postérieures alors que les individus à développement typique démontrent une activation plus élevée au niveau antérieur. Ces résultats pourraient suggérer que les personnes du spectre autistique se fient davantage aux processus perceptifs de bas-niveau pour parvenir à compléter les images d’objets fragmentés. Ainsi, lorsque confrontés aux images d’objets partiellement visibles pouvant sembler ambiguës, les individus avec autisme pourraient démontrer plus de difficultés à générer de multiples prédictions au sujet de l’identité d’un objet qu’ils perçoivent. Les implications théoriques et cliniques, les limites et perspectives futures de ces résultats sont discutées.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

As AI has begun to reach out beyond its symbolic, objectivist roots into the embodied, experientialist realm, many projects are exploring different aspects of creating machines which interact with and respond to the world as humans do. Techniques for visual processing, object recognition, emotional response, gesture production and recognition, etc., are necessary components of a complete humanoid robot. However, most projects invariably concentrate on developing a few of these individual components, neglecting the issue of how all of these pieces would eventually fit together. The focus of the work in this dissertation is on creating a framework into which such specific competencies can be embedded, in a way that they can interact with each other and build layers of new functionality. To be of any practical value, such a framework must satisfy the real-world constraints of functioning in real-time with noisy sensors and actuators. The humanoid robot Cog provides an unapologetically adequate platform from which to take on such a challenge. This work makes three contributions to embodied AI. First, it offers a general-purpose architecture for developing behavior-based systems distributed over networks of PC's. Second, it provides a motor-control system that simulates several biological features which impact the development of motor behavior. Third, it develops a framework for a system which enables a robot to learn new behaviors via interacting with itself and the outside world. A few basic functional modules are built into this framework, enough to demonstrate the robot learning some very simple behaviors taught by a human trainer. A primary motivation for this project is the notion that it is practically impossible to build an "intelligent" machine unless it is designed partly to build itself. This work is a proof-of-concept of such an approach to integrating multiple perceptual and motor systems into a complete learning agent.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Most psychophysical studies of object recognition have focussed on the recognition and representation of individual objects subjects had previously explicitely been trained on. Correspondingly, modeling studies have often employed a 'grandmother'-type representation where the objects to be recognized were represented by individual units. However, objects in the natural world are commonly members of a class containing a number of visually similar objects, such as faces, for which physiology studies have provided support for a representation based on a sparse population code, which permits generalization from the learned exemplars to novel objects of that class. In this paper, we present results from psychophysical and modeling studies intended to investigate object recognition in natural ('continuous') object classes. In two experiments, subjects were trained to perform subordinate level discrimination in a continuous object class - images of computer-rendered cars - created using a 3D morphing system. By comparing the recognition performance of trained and untrained subjects we could estimate the effects of viewpoint-specific training and infer properties of the object class-specific representation learned as a result of training. We then compared the experimental findings to simulations, building on our recently presented HMAX model of object recognition in cortex, to investigate the computational properties of a population-based object class representation as outlined above. We find experimental evidence, supported by modeling results, that training builds a viewpoint- and class-specific representation that supplements a pre-existing repre-sentation with lower shape discriminability but possibly greater viewpoint invariance.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Object recognition in the visual cortex is based on a hierarchical architecture, in which specialized brain regions along the ventral pathway extract object features of increasing levels of complexity, accompanied by greater invariance in stimulus size, position, and orientation. Recent theoretical studies postulate a non-linear pooling function, such as the maximum (MAX) operation could be fundamental in achieving such invariance. In this paper, we are concerned with neurally plausible mechanisms that may be involved in realizing the MAX operation. Four canonical circuits are proposed, each based on neural mechanisms that have been previously discussed in the context of cortical processing. Through simulations and mathematical analysis, we examine the relative performance and robustness of these mechanisms. We derive experimentally verifiable predictions for each circuit and discuss their respective physiological considerations.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Tsunoda et al. (2001) recently studied the nature of object representation in monkey inferotemporal cortex using a combination of optical imaging and extracellular recordings. In particular, they examined IT neuron responses to complex natural objects and "simplified" versions thereof. In that study, in 42% of the cases, optical imaging revealed a decrease in the number of activation patches in IT as stimuli were "simplified". However, in 58% of the cases, "simplification" of the stimuli actually led to the appearance of additional activation patches in IT. Based on these results, the authors propose a scheme in which an object is represented by combinations of active and inactive columns coding for individual features. We examine the patterns of activation caused by the same stimuli as used by Tsunoda et al. in our model of object recognition in cortex (Riesenhuber 99). We find that object-tuned units can show a pattern of appearance and disappearance of features identical to the experiment. Thus, the data of Tsunoda et al. appear to be in quantitative agreement with a simple object-based representation in which an object's identity is coded by its similarities to reference objects. Moreover, the agreement of simulations and experiment suggests that the simplification procedure used by Tsunoda (2001) is not necessarily an accurate method to determine neuronal tuning.