951 results for Visual Object Recognition


Relevance:

90.00%

Publisher:

Abstract:

Psychopathy is associated with well-known characteristics such as a lack of empathy and impulsive behaviour, but it has also been associated with impaired recognition of emotional facial expressions. The use of event-related potentials (ERPs) to examine this phenomenon could shed light on the specific time course and neural activation associated with emotion recognition processes as they relate to psychopathic traits. In the current study we examined the P1, N170, and vertex positive potential (VPP) ERP components and behavioural performance with respect to scores on the Self-Report Psychopathy (SRP-III) questionnaire. Thirty undergraduates completed two tasks, the first of which required the recognition and categorization of affective face stimuli under varying presentation conditions. Happy, angry or fearful faces were presented with attention directed to the mouth, nose or eye region and with varied stimulus exposure durations (30, 75, or 150 ms). We found behavioural performance to be unrelated to psychopathic personality traits in all conditions, but there was a trend for the N170 to peak later in response to fearful and happy facial expressions for individuals high in psychopathic traits. However, the amplitude of the VPP was significantly negatively associated with psychopathic traits, but only in response to stimuli presented under a nose-level fixation. Finally, psychopathic traits were found to be associated with longer N170 latencies in response to stimuli presented under the 30 ms exposure duration. In the second task, participants were required to inhibit processing of irrelevant affective and scrambled face distractors while categorizing unrelated word stimuli as living or nonliving.
Psychopathic traits were hypothesized to be positively associated with behavioural performance, as it was proposed that individuals high in psychopathic traits would be less likely to automatically attend to task-irrelevant affective distractors, facilitating word categorization. Thus, decreased interference would be reflected in smaller N170 components, indicating less neural activity associated with processing of distractor faces. We found that overall performance decreased in the presence of angry and fearful distractor faces as psychopathic traits increased. In addition, the amplitude of the N170 decreased and the latency increased in response to affective distractor faces for individuals with higher levels of psychopathic traits. Although we failed to find the predicted behavioural deficit in emotion recognition in Task 1 and facilitation effect in Task 2, the findings of increased N170 and VPP latencies in response to emotional faces are consistent with the proposition that abnormal emotion recognition processes may in fact be inherent to psychopathy as a continuous personality trait.

Relevance:

90.00%

Publisher:

Abstract:

This lexical decision study with eye tracking of Japanese two-kanji-character words investigated the order in which a whole two-character word and its morphographic constituents are activated in the course of lexical access, the relative contributions of the left and the right characters in lexical decision, the depth to which semantic radicals are processed, and how nonlinguistic factors affect lexical processes. Mixed-effects regression analyses of response times and subgaze durations (i.e., first-pass fixation time spent on each of the two characters) revealed joint contributions of morphographic units at all levels of the linguistic structure with the magnitude and the direction of the lexical effects modulated by readers’ locus of attention in a left-to-right preferred processing path. During the early time frame, character effects were larger in magnitude and more robust than radical and whole-word effects, regardless of the font size and the type of nonwords. Extending previous radical-based and character-based models, we propose a task/decision-sensitive character-driven processing model with a level-skipping assumption: Connections from the feature level bypass the lower radical level and link up directly to the higher character level.

Relevance:

90.00%

Publisher:

Abstract:

Computer vision tasks such as object recognition remain unsolved to this day. Learning algorithms such as Artificial Neural Networks (ANNs) represent a promising approach for learning features that are useful for these tasks. This optimization process is nevertheless difficult. Deep networks based on Restricted Boltzmann Machines (RBMs) have recently been proposed to guide the extraction of intermediate representations by means of an unsupervised learning algorithm. This thesis presents, through three articles, contributions to this field of research. The first article deals with the convolutional RBM. The use of local receptive fields, together with the grouping of hidden units into layers that share the same parameters, considerably reduces the number of parameters to learn and yields local feature detectors that are equivariant to translation. This leads to models with a better likelihood than RBMs trained on image patches. The second article is motivated by recent discoveries in neuroscience. It analyzes the impact of quadratic units on visual classification tasks, as well as that of a new activation function. We observe that ANNs based on quadratic units using the softsign function give better generalization performance. The last article offers a critical view of popular RBM training algorithms. We show that the Contrastive Divergence (CD) algorithm and Persistent CD (PCD) are not robust: both require a relatively flat energy surface for their negative chain to mix. PCD with "fast weights" circumvents this problem by slightly perturbing the model; however, this generates noisy samples.
The use of tempered chains in the negative phase is a robust way to address these problems and leads to better generative models.
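To make the "negative chain" mentioned in the abstract above more concrete, here is a minimal sketch of a single textbook CD-1 update for a small binary RBM. It is not taken from the thesis: the function names (`hidden_probs`, `visible_probs`, `cd1_update`), the single-example update, and the learning rate are all illustrative assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, W, c):
    # P(h_j = 1 | v); W[i][j] connects visible unit i to hidden unit j.
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_probs(h, W, b):
    # P(v_i = 1 | h) for each visible unit i.
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

def sample(probs, rng):
    # Draw one binary sample per unit from its Bernoulli probability.
    return [1 if rng.random() < p else 0 for p in probs]

def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step on a single binary vector v0 (hypothetical helper).

    Updates W, b and c in place and returns the negative-phase sample v1.
    """
    ph0 = hidden_probs(v0, W, c)                # positive phase
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b), rng)   # one Gibbs step: the negative chain
    ph1 = hidden_probs(v1, W, c)                # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    return v1

if __name__ == "__main__":
    rng = random.Random(0)
    W = [[0.0, 0.0] for _ in range(3)]   # 3 visible units, 2 hidden units
    b, c = [0.0] * 3, [0.0] * 2
    for _ in range(100):
        cd1_update([1, 0, 1], W, b, c, lr=0.1, rng=rng)
    print(W)
```

In this sketch the negative chain restarts from the data vector at every update. Persistent CD would instead continue the chain from the previous `v1`, which is why its behaviour depends on how well that chain mixes.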

Relevance:

90.00%

Publisher:

Abstract:

Object recognition is a complex task in which the brain must coherently assemble all the elements of an object that are accessible to the eye in order to recognize it. The cortical representation of the object is built through a so-called "bottom-up" process involving, in particular, the occipital and temporal regions. A "top-down" mechanism in the parietal and frontal regions facilitates recognition by suggesting potential identities for the object to be recognized. However, how these mechanisms operate is poorly understood. Several studies have demonstrated induced gamma activity at the moment of coherent perception of stimuli, which gives it an important role in object recognition. However, these studies used imprecise recording techniques as well as repetitive stimuli. The first study of this thesis aims to describe the spatio-temporal dynamics of induced gamma activity using intracranial electroencephalography, a technique with some of the most precise spatial and temporal resolution. A fragmented-image task was designed to describe induced gamma activity at different levels of recognition while avoiding the repetition of already recognized stimuli. To better characterize the "top-down" mechanisms, the task was repeated after a 24-hour delay. The results show strong induced gamma activity at the moment of recognition in the "bottom-up" regions. As for the "top-down" mechanisms, activity was greater in the occipitoparietal regions. After 24 hours, activity was stronger in the frontal regions, suggesting that "top-down" processes adapt to the demands of the task.
Very few studies have examined the alpha rhythm in object recognition, even though it is well known for its role in attention, memory and communication between distant neuronal regions. The second study of this thesis therefore aims to describe more precisely the involvement of the alpha rhythm in object recognition, using the same techniques and tasks as the first study. The analyses reveal strong alpha activity propagating from posterior to anterior regions that is not specific to recognition. Alpha phase synchronization, by contrast, was observable only at the moment of recognition. After 24 hours, a similar pattern was observable, but the amplitude of the activity increased frontally and the phase synchronies were more widely distributed. The alpha rhythm therefore seems to reflect attentional and communicational processes in object recognition. In conclusion, this thesis described with precision the spatio-temporal dynamics of induced gamma activity and of the alpha rhythm, and shed further light on the potential roles that these two rhythms play in object recognition.

Relevance:

90.00%

Publisher:

Abstract:

The ability of the human visual system to complete a partially revealed image and to derive a global form from its incomplete visible fragments is a phenomenon that continues to interest many scientists working in research fields such as computer science, artificial intelligence engineering, perception and neuroscience. In this thesis, we focus specifically on the neural substrates associated with this phenomenon of perceptual closure. The general objective of the thesis is therefore to explore the spatio-temporal course of the neural correlates associated with perceptual closure during an object identification task. The first article aims to characterize the electrophysiological signature of perceptual closure in typically developing individuals, in order to determine whether perceptual closure processes reflect an iterative interaction between low-level and high-level mechanisms and whether these are engaged at an early or a late stage of visual information processing. The second article aims to explore the spatio-temporal course of the neural mechanisms underlying perceptual closure, in order to determine whether perceptual closure processes in individuals with an autistic disorder are characterized by an idiosyncratic signature of amplitude changes in event-related potentials (ERPs). In other words, we seek to determine whether perceptual closure in autism is atypical and would rely more heavily on low-level and/or high-level mechanisms.
The results of the first article indicate that perceptual closure is temporally associated with the occurrence of the N80 and P160 ERP components, as revealed by clear significant differences between objects and scrambled, unrecognizable versions. Finally, we propose that perceptual closure is a transitional process reflecting proactive interactions between the neural mechanisms that work to match the fragmented sensory input to a plausible object representation in memory. The results of the second article reveal early effects of fragmentation and identification on the N80 and P160 ERP components, with no effects at all on late components, for individuals with high-functioning autism and with Asperger syndrome. For these two autism spectrum groups, the electrophysiological data suggest that there is no gradual pre-activation of cortical regions, frontal regions among them, in the moments preceding and leading to the identification of fragmented objects. For participants with autism and with Asperger syndrome, statistical analyses also show greater activation in posterior regions, whereas typically developing individuals show greater activation anteriorly. These results could suggest that people on the autism spectrum rely more on low-level perceptual processes to complete images of fragmented objects. Thus, when faced with images of partially visible objects that may seem ambiguous, individuals with autism could have more difficulty generating multiple predictions about the identity of an object they perceive. The theoretical and clinical implications, limitations and future directions of these results are discussed.

Relevance:

90.00%

Publisher:

Abstract:

As AI has begun to reach out beyond its symbolic, objectivist roots into the embodied, experientialist realm, many projects are exploring different aspects of creating machines which interact with and respond to the world as humans do. Techniques for visual processing, object recognition, emotional response, gesture production and recognition, etc., are necessary components of a complete humanoid robot. However, most projects invariably concentrate on developing a few of these individual components, neglecting the issue of how all of these pieces would eventually fit together. The focus of the work in this dissertation is on creating a framework into which such specific competencies can be embedded, in a way that they can interact with each other and build layers of new functionality. To be of any practical value, such a framework must satisfy the real-world constraints of functioning in real-time with noisy sensors and actuators. The humanoid robot Cog provides an unapologetically adequate platform from which to take on such a challenge. This work makes three contributions to embodied AI. First, it offers a general-purpose architecture for developing behavior-based systems distributed over networks of PCs. Second, it provides a motor-control system that simulates several biological features which impact the development of motor behavior. Third, it develops a framework for a system which enables a robot to learn new behaviors via interacting with itself and the outside world. A few basic functional modules are built into this framework, enough to demonstrate the robot learning some very simple behaviors taught by a human trainer. A primary motivation for this project is the notion that it is practically impossible to build an "intelligent" machine unless it is designed partly to build itself. This work is a proof-of-concept of such an approach to integrating multiple perceptual and motor systems into a complete learning agent.

Relevance:

90.00%

Publisher:

Abstract:

Most psychophysical studies of object recognition have focussed on the recognition and representation of individual objects that subjects had previously been explicitly trained on. Correspondingly, modeling studies have often employed a 'grandmother'-type representation where the objects to be recognized were represented by individual units. However, objects in the natural world are commonly members of a class containing a number of visually similar objects, such as faces, for which physiology studies have provided support for a representation based on a sparse population code, which permits generalization from the learned exemplars to novel objects of that class. In this paper, we present results from psychophysical and modeling studies intended to investigate object recognition in natural ('continuous') object classes. In two experiments, subjects were trained to perform subordinate level discrimination in a continuous object class - images of computer-rendered cars - created using a 3D morphing system. By comparing the recognition performance of trained and untrained subjects we could estimate the effects of viewpoint-specific training and infer properties of the object class-specific representation learned as a result of training. We then compared the experimental findings to simulations, building on our recently presented HMAX model of object recognition in cortex, to investigate the computational properties of a population-based object class representation as outlined above. We find experimental evidence, supported by modeling results, that training builds a viewpoint- and class-specific representation that supplements a pre-existing representation with lower shape discriminability but possibly greater viewpoint invariance.

Relevance:

90.00%

Publisher:

Abstract:

Object recognition in the visual cortex is based on a hierarchical architecture, in which specialized brain regions along the ventral pathway extract object features of increasing levels of complexity, accompanied by greater invariance in stimulus size, position, and orientation. Recent theoretical studies postulate that a non-linear pooling function, such as the maximum (MAX) operation, could be fundamental in achieving such invariance. In this paper, we are concerned with neurally plausible mechanisms that may be involved in realizing the MAX operation. Four canonical circuits are proposed, each based on neural mechanisms that have been previously discussed in the context of cortical processing. Through simulations and mathematical analysis, we examine the relative performance and robustness of these mechanisms. We derive experimentally verifiable predictions for each circuit and discuss their respective physiological considerations.
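As a small illustration of why MAX pooling gives invariance, the sketch below compares a hard maximum with a smooth, exponentially weighted average over afferent responses. The smooth form and its sharpness parameter `q` are assumptions chosen for illustration only; they are not claimed to be any of the four circuits proposed in the paper.

```python
import math

def max_pool(responses):
    """Hard MAX over the afferent responses."""
    return max(responses)

def soft_max_pool(responses, q):
    """Smooth stand-in for MAX: sum_i x_i * exp(q * x_i) / sum_i exp(q * x_i).

    q = 0 gives the plain mean; a large q approaches the hard maximum.
    """
    m = max(responses)
    # Subtracting m leaves the ratio unchanged and avoids overflow in exp().
    weights = [math.exp(q * (x - m)) for x in responses]
    return sum(x * w for x, w in zip(responses, weights)) / sum(weights)

if __name__ == "__main__":
    # The same stimulus seen at two positions: the afferent responses are permuted.
    at_left = [0.1, 0.9, 0.3]
    at_right = [0.3, 0.1, 0.9]
    print(max_pool(at_left), max_pool(at_right))
    for q in (0.0, 5.0, 50.0):
        print(q, soft_max_pool(at_left, q))
```

Because the pooled output depends only on the strongest afferent, moving the stimulus so that a different afferent responds most strongly leaves the output unchanged, which is the kind of position invariance the abstract refers to.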

Relevance:

90.00%

Publisher:

Abstract:

Tsunoda et al. (2001) recently studied the nature of object representation in monkey inferotemporal cortex using a combination of optical imaging and extracellular recordings. In particular, they examined IT neuron responses to complex natural objects and "simplified" versions thereof. In that study, in 42% of the cases, optical imaging revealed a decrease in the number of activation patches in IT as stimuli were "simplified". However, in 58% of the cases, "simplification" of the stimuli actually led to the appearance of additional activation patches in IT. Based on these results, the authors propose a scheme in which an object is represented by combinations of active and inactive columns coding for individual features. We examine the patterns of activation caused by the same stimuli as used by Tsunoda et al. in our model of object recognition in cortex (Riesenhuber 99). We find that object-tuned units can show a pattern of appearance and disappearance of features identical to the experiment. Thus, the data of Tsunoda et al. appear to be in quantitative agreement with a simple object-based representation in which an object's identity is coded by its similarities to reference objects. Moreover, the agreement of simulations and experiment suggests that the simplification procedure used by Tsunoda et al. (2001) is not necessarily an accurate method to determine neuronal tuning.

Relevance:

90.00%

Publisher:

Abstract:

The question of how shape is represented is of central interest to understanding visual processing in cortex. While tuning properties of the cells in the early part of the ventral visual stream, thought to be responsible for object recognition in the primate, are comparatively well understood, several different theories have been proposed regarding tuning in higher visual areas, such as V4. We used the model of object recognition in cortex presented by Riesenhuber and Poggio (1999), where more complex shape tuning in higher layers is the result of combining afferent inputs tuned to simpler features, and compared the tuning properties of model units in intermediate layers to those of V4 neurons from the literature. In particular, we investigated the issue of shape representation in visual areas V1 and V4 using oriented bars and various types of gratings (polar, hyperbolic, and Cartesian), as used in several physiology experiments. Our computational model was able to reproduce several physiological findings, such as the broadening distribution of the orientation bandwidths and the emergence of a bias toward non-Cartesian stimuli. Interestingly, the simulation results suggest that some V4 neurons receive input from afferents with spatially separated receptive fields, leading to experimentally testable predictions. However, the simulations also show that the stimulus set of Cartesian and non-Cartesian gratings is not sufficiently complex to probe shape tuning in higher areas, necessitating the use of more complex stimulus sets.
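The idea of building more complex tuning by combining afferents tuned to simpler features can be sketched in a few lines. The example below is a toy illustration, not the published model: it assumes Gaussian template matching for the tuned unit and a maximum over positions for the pooling unit, and the names `tuned_unit`, `pooling_unit` and `layer_response` are hypothetical.

```python
import math

def tuned_unit(afferents, template, sigma=1.0):
    """Gaussian tuning: the response is 1.0 when the afferent pattern equals the template."""
    d2 = sum((a - t) ** 2 for a, t in zip(afferents, template))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def pooling_unit(responses):
    """Pool tuned units that share one template but look at different positions."""
    return max(responses)

def layer_response(patches, template, sigma=1.0):
    """Apply the same tuned unit to every patch, then pool over positions."""
    return pooling_unit([tuned_unit(p, template, sigma) for p in patches])

if __name__ == "__main__":
    # Each patch holds two afferent activities, e.g. responses of two
    # simpler feature detectors at one location (illustrative values).
    template = [1.0, 0.0]
    scene = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    shifted = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    print(layer_response(scene, template), layer_response(shifted, template))
```

Stacking such tuning and pooling stages is one way selectivity and position tolerance can grow together from layer to layer. Afferents drawn from spatially separated patches, as the abstract suggests for some V4 neurons, would correspond here to a template spanning several locations.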

Relevance:

90.00%

Publisher:

Abstract:

We present a component-based approach for recognizing objects under large pose changes. From a set of training images of a given object we extract a large number of components which are clustered based on the similarity of their image features and their locations within the object image. The cluster centers build an initial set of component templates from which we select a subset for the final recognizer. In experiments we evaluate different sizes and types of components and three standard techniques for component selection. The component classifiers are finally compared to global classifiers on a database of four objects.

Relevance:

90.00%

Publisher:

Abstract:

Understanding how the human visual system recognizes objects is one of the key challenges in neuroscience. Inspired by a large body of physiological evidence (Felleman and Van Essen, 1991; Hubel and Wiesel, 1962; Livingstone and Hubel, 1988; Tso et al., 2001; Zeki, 1993), a general class of recognition models has emerged which is based on a hierarchical organization of visual processing, with succeeding stages being sensitive to image features of increasing complexity (Hummel and Biederman, 1992; Riesenhuber and Poggio, 1999; Selfridge, 1959). However, these models appear to be incompatible with some well-known psychophysical results. Prominent among these are experiments investigating recognition impairments caused by vertical inversion of images, especially those of faces. It has been reported that faces that differ "featurally" are much easier to distinguish when inverted than those that differ "configurally" (Freire et al., 2000; Le Grand et al., 2001; Mondloch et al., 2002), a finding that is difficult to reconcile with the aforementioned models. Here we show that after controlling for subjects' expectations, there is no difference between "featurally" and "configurally" transformed faces in terms of inversion effect. This result reinforces the plausibility of simple hierarchical models of object representation and recognition in cortex.

Relevance:

90.00%

Publisher:

Abstract:

Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. We propose a performance criterion for a local descriptor based on the tradeoff between selectivity and invariance. In this paper, we evaluate several local descriptors with respect to selectivity and invariance. The descriptors that we evaluated are Gaussian derivatives up to the third order, gray image patches, and Laplacian-based descriptors with either three-scale or one-scale filters. We compare selectivity and invariance to several affine changes such as rotation, scale, brightness, and viewpoint. Comparisons have been made keeping the dimensionality of the descriptors roughly constant. The overall results indicate a good performance by the descriptor based on a set of oriented Gaussian filters. It is interesting that oriented receptive fields similar to the Gaussian derivatives as well as receptive fields similar to the Laplacian are found in primate visual cortex.
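For readers unfamiliar with Gaussian-derivative descriptors, the following is a minimal one-dimensional sketch. The paper works with images; reducing the problem to a 1-D signal, as well as the values of `sigma` and `radius` and the function names, are simplifying assumptions made here for illustration.

```python
import math

def gaussian_derivative_kernel(order, sigma, radius):
    """Sample the order-th derivative (0 to 3) of a 1-D Gaussian at integer offsets."""
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        if order == 0:
            poly = 1.0
        elif order == 1:
            poly = -x / sigma ** 2
        elif order == 2:
            poly = (x * x - sigma ** 2) / sigma ** 4
        elif order == 3:
            poly = (3.0 * sigma ** 2 * x - x ** 3) / sigma ** 6
        else:
            raise ValueError("order must be 0, 1, 2 or 3")
        kernel.append(poly * g)
    return kernel

def descriptor(signal, center, sigma=2.0, radius=6):
    """Filter responses of order 1, 2 and 3 centred at signal[center].

    The kernels are applied as a correlation (no kernel flip), so odd-order
    responses have the opposite sign to a true convolution.
    """
    window = signal[center - radius:center + radius + 1]
    return [sum(k * s for k, s in zip(gaussian_derivative_kernel(n, sigma, radius), window))
            for n in (1, 2, 3)]

if __name__ == "__main__":
    signal = [float(i % 5) for i in range(20)]
    brighter = [s + 100.0 for s in signal]   # constant brightness offset
    print(descriptor(signal, 10))
    print(descriptor(brighter, 10))
```

The sampled odd-order kernels are antisymmetric and so sum to zero, which makes the first- and third-order responses insensitive to a constant brightness offset. The truncated second-order kernel only sums to approximately zero, so its invariance is approximate in this sketch.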

Relevance:

90.00%

Publisher:

Abstract:

Artifacts made by humans, such as items of furniture and houses, exhibit an enormous amount of variability in shape. In this paper, we concentrate on models of the shapes of objects that are made up of fixed collections of sub-parts whose dimensions and spatial arrangement exhibit variation. Our goals are: to learn these models from data and to use them for recognition. Our emphasis is on learning and recognition from three-dimensional data, to test the basic shape-modeling methodology. In this paper we also demonstrate how to use models learned in three dimensions for recognition of two-dimensional sketches of objects.