929 resultados para PASCAL Visual Object Classes (VOC)
Resumo:
The brain extracts useful features from a maelstrom of sensory information, and a fundamental goal of theoretical neuroscience is to work out how it does so. One proposed feature extraction strategy is motivated by the observation that the meaning of sensory data, such as the identity of a moving visual object, is often more persistent than the activation of any single sensory receptor. This notion is embodied in the slow feature analysis (SFA) algorithm, which uses “slowness” as an heuristic by which to extract semantic information from multi-dimensional time-series. Here, we develop a probabilistic interpretation of this algorithm showing that inference and learning in the limiting case of a suitable probabilistic model yield exactly the results of SFA. Similar equivalences have proved useful in interpreting and extending comparable algorithms such as independent component analysis. For SFA, we use the equivalent probabilistic model as a conceptual spring-board, with which to motivate several novel extensions to the algorithm.
Resumo:
Several studies have shown that sensory contextual cues can reduce the interference observed during learning of opposing force fields. However, because each study examined a small set of cues, often in a unique paradigm, the relative efficacy of different sensory contextual cues is unclear. In the present study we quantify how seven contextual cues, some investigated previously and some novel, affect the formation and recall of motor memories. Subjects made movements in a velocity-dependent curl field, with direction varying randomly from trial to trial but always associated with a unique contextual cue. Linking field direction to the cursor or background color, or to peripheral visual motion cues, did not reduce interference. In contrast, the orientation of a visual object attached to the hand cursor significantly reduced interference, albeit by a small amount. When the fields were associated with movement in different locations in the workspace, a substantial reduction in interference was observed. We tested whether this reduction in interference was due to the different locations of the visual feedback (targets and cursor) or the movements (proprioceptive). When the fields were associated only with changes in visual display location (movements always made centrally) or only with changes in the movement location (visual feedback always displayed centrally), a substantial reduction in interference was observed. These results show that although some visual cues can lead to the formation and recall of distinct representations in motor memory, changes in spatial visual and proprioceptive states of the movement are far more effective than changes in simple visual contextual cues.
Resumo:
本文针对异构分布环境下的多机器人系统 ,提出了一种基于 CORBA规范和框架请求代理 (FRB)这一方式的应用系统集成模型 .为实现分布异构环境下的多机器人的通信、协同、编程 ,支持系统任务重组与重构、应用互操作提供一条有效途径 .并给出了一个自行设计的面向多机器人基于CORBA的对象互操作的方法、机制 .给出基于 CORBA的应用编程接口 (API)与扩展接口定义语言X- IDL,通过基于 CORBA的多机器人基本对象类的建立与对象的实现 ,实现基于 CORBA的多机器人互操作与开放分布处理原型系统 .
Resumo:
Visual object recognition requires the matching of an image with a set of models stored in memory. In this paper we propose an approach to recognition in which a 3-D object is represented by the linear combination of 2-D images of the object. If M = {M1,...Mk} is the set of pictures representing a given object, and P is the 2-D image of an object to be recognized, then P is considered an instance of M if P = Eki=aiMi for some constants ai. We show that this approach handles correctly rigid 3-D transformations of objects with sharp as well as smooth boundaries, and can also handle non-rigid transformations. The paper is divided into two parts. In the first part we show that the variety of views depicting the same object under different transformations can often be expressed as the linear combinations of a small number of views. In the second part we suggest how this linear combinatino property may be used in the recognition process.
Resumo:
We propose to investigate a model-based technique for encoding non-rigid object classes in terms of object prototypes. Objects from the same class can be parameterized by identifying shape and appearance invariants of the class to devise low-level representations. The approach presented here creates a flexible model for an object class from a set of prototypes. This model is then used to estimate the parameters of low-level representation of novel objects as combinations of the prototype parameters. Variations in the object shape are modeled as non-rigid deformations. Appearance variations are modeled as intensity variations. In the training phase, the system is presented with several example prototype images. These prototype images are registered to a reference image by a finite element-based technique called Active Blobs. The deformations of the finite element model to register a prototype image with the reference image provide the shape description or shape vector for the prototype. The shape vector for each prototype, is then used to warp the prototype image onto the reference image and obtain the corresponding texture vector. The prototype texture vectors, being warped onto the same reference image have a pixel by pixel correspondence with each other and hence are "shape normalized". Given sufficient number of prototypes that exhibit appropriate in-class variations, the shape and the texture vectors define a linear prototype subspace that spans the object class. Each prototype is a vector in this subspace. The matching phase involves the estimation of a set of combination parameters for synthesis of the novel object by combining the prototype shape and texture vectors. The strengths of this technique lie in the combined estimation of both shape and appearance parameters. This is in contrast with the previous approaches where shape and appearance parameters were estimated separately.
Resumo:
Anterior inferotemporal cortex (ITa) plays a key role in visual object recognition. Recognition is tolerant to object position, size, and view changes, yet recent neurophysiological data show ITa cells with high object selectivity often have low position tolerance, and vice versa. A neural model learns to simulate both this tradeoff and ITa responses to image morphs using large-scale and small-scale IT cells whose population properties may support invariant recognition.
Resumo:
Les temps de réponse dans une tache de reconnaissance d’objets visuels diminuent de façon significative lorsque les cibles peuvent être distinguées à partir de deux attributs redondants. Le gain de redondance pour deux attributs est un résultat commun dans la littérature, mais un gain causé par trois attributs redondants n’a été observé que lorsque ces trois attributs venaient de trois modalités différentes (tactile, auditive et visuelle). La présente étude démontre que le gain de redondance pour trois attributs de la même modalité est effectivement possible. Elle inclut aussi une investigation plus détaillée des caractéristiques du gain de redondance. Celles-ci incluent, outre la diminution des temps de réponse, une diminution des temps de réponses minimaux particulièrement et une augmentation de la symétrie de la distribution des temps de réponse. Cette étude présente des indices que ni les modèles de course, ni les modèles de coactivation ne sont en mesure d’expliquer l’ensemble des caractéristiques du gain de redondance. Dans ce contexte, nous introduisons une nouvelle méthode pour évaluer le triple gain de redondance basée sur la performance des cibles doublement redondantes. Le modèle de cascade est présenté afin d’expliquer les résultats de cette étude. Ce modèle comporte plusieurs voies de traitement qui sont déclenchées par une cascade d’activations avant de satisfaire un seul critère de décision. Il offre une approche homogène aux recherches antérieures sur le gain de redondance. L’analyse des caractéristiques des distributions de temps de réponse, soit leur moyenne, leur symétrie, leur décalage ou leur étendue, est un outil essentiel pour cette étude. Il était important de trouver un test statistique capable de refléter les différences au niveau de toutes ces caractéristiques. Nous abordons la problématique d’analyser les temps de réponse sans perte d’information, ainsi que l’insuffisance des méthodes d’analyse communes dans ce contexte, comme grouper les temps de réponses de plusieurs participants (e. g. Vincentizing). Les tests de distributions, le plus connu étant le test de Kolmogorov- Smirnoff, constituent une meilleure alternative pour comparer des distributions, celles des temps de réponse en particulier. Un test encore inconnu en psychologie est introduit : le test d’Anderson-Darling à deux échantillons. Les deux tests sont comparés, et puis nous présentons des indices concluants démontrant la puissance du test d’Anderson-Darling : en comparant des distributions qui varient seulement au niveau de (1) leur décalage, (2) leur étendue, (3) leur symétrie, ou (4) leurs extrémités, nous pouvons affirmer que le test d’Anderson-Darling reconnait mieux les différences. De plus, le test d’Anderson-Darling a un taux d’erreur de type I qui correspond exactement à l’alpha tandis que le test de Kolmogorov-Smirnoff est trop conservateur. En conséquence, le test d’Anderson-Darling nécessite moins de données pour atteindre une puissance statistique suffisante.
Resumo:
Les études sont mitigées sur les séquelles cognitives des commotions cérébrales, certaines suggèrent qu’elles se résorbent rapidement tandis que d’autres indiquent qu’elles persistent dans le temps. Par contre, aucunes données n’existent pour indiquer si une tâche cognitive comme l’imagerie mentale visuelle fait ressortir des séquelles à la suite d’une commotion cérébrale. Ainsi, la présente étude a pour objet d’évaluer l’effet des commotions cérébrales d’origine sportive sur la capacité d’imagerie mentale visuelle d’objets et d’imagerie spatiale des athlètes. Afin de répondre à cet objectif, nous comparons les capacités d’imagerie mentale chez des joueurs de football masculins de calibre universitaire sans historique répertorié de commotions cérébrales (n=15) et chez un second groupe d’athlète ayant été victime d’au moins une commotion cérébrale (n=15). Notre hypothèse est que les athlètes non-commotionnés ont une meilleure imagerie mentale que les athlètes commotionnés. Les résultats infirment notre hypothèse. Les athlètes commotionnés performent aussi bien que les athlètes non-commotionnés aux trois tests suivants : Paper Folding Test (PFT), Visual Object Identification Task (VOIT) et Vividness of Visual Imagery Questionnaire (VVIQ). De plus, ni le nombre de commotions cérébrales ni le temps écoulé depuis la dernière commotion cérébrale n’influent sur la performance des athlètes commotionnés.
Resumo:
We describe a method for modeling object classes (such as faces) using 2D example images and an algorithm for matching a model to a novel image. The object class models are "learned'' from example images that we call prototypes. In addition to the images, the pixelwise correspondences between a reference prototype and each of the other prototypes must also be provided. Thus a model consists of a linear combination of prototypical shapes and textures. A stochastic gradient descent algorithm is used to match a model to a novel image by minimizing the error between the model and the novel image. Example models are shown as well as example matches to novel images. The robustness of the matching algorithm is also evaluated. The technique can be used for a number of applications including the computation of correspondence between novel images of a certain known class, object recognition, image synthesis and image compression.
Resumo:
This report presents an algorithm for locating the cut points for and separatingvertically attached traffic signs in Sweden. This algorithm provides severaladvanced digital image processing features: binary image which representsvisual object and its complex rectangle background with number one and zerorespectively, improved cross correlation which shows the similarity of 2Dobjects and filters traffic sign candidates, simplified shape decompositionwhich smoothes contour of visual object iteratively in order to reduce whitenoises, flipping point detection which locates black noises candidates, chasmfilling algorithm which eliminates black noises, determines the final cut pointsand separates originally attached traffic signs into individual ones. At each step,the mediate results as well as the efficiency in practice would be presented toshow the advantages and disadvantages of the developed algorithm. Thisreport concentrates on contour-based recognition of Swedish traffic signs. Thegeneral shapes cover upward triangle, downward triangle, circle, rectangle andoctagon. At last, a demonstration program would be presented to show howthe algorithm works in real-time environment.
Resumo:
Tiere müssen Nahrung, Fortpflanzungspartner oder eine angenehme Umgebung finden und gleichzeitig eventuellen Gefahren aus dem Weg gehen. Eine effektive Orientierungsstrategie stellt für sie einen enormen Vorteil dar, vor allem wenn sie sich in einer komplexen Umwelt bewegen. Eine bisher unbekannte Art, die Orientierung zu optimieren, wird in dieser Arbeit vorgestellt. Sie analysiert, wie sich Taufliegen in einem Temperatur- Gradienten sowie in einer visuell geprägten Umwelt orientieren. Die dabei gefundene Orientierungsstrategie wird als „Memotaxis“ bezeichnet. Sie basiert auf der Integration von Informationen entlang der Wegstrecke, was dazu führt, dass die eingeschlagene Richtung proportional zum positiven Feedback immer stereotyper beibehalten wird. Obwohl die Memotaxis perfekt für die Orientierung in verrauschten Gradienten geeignet ist, wurde ihre Existenz in Situationen mit wenig Rauschen nachgewiesen. Die Strategie führt im Temperaturgradienten dazu, dass Fliegen umso weiter über ein Temperaturoptimum hinweg laufen, je weiter sie vorher darauf zuliefen. Beim Anlauf visueller Stimuli zeigen sie ein ähnliches Verhalten. Je weiter sie auf eine Landmarke zulaufen, desto länger dauert es, bis sie nach deren Verschwinden von dieser Richtung abweichen. Dies gilt auch dann, wenn man gleichzeitig mit dem Verschwinden der Landmarke der Fliege eine andere anbietet. Memotaxis sollte bei vielen Tieren eine gewichtige Rolle spielen, bei der Taufliege können durch die verfügbaren genetischen Methoden zusätzlich die dafür relevanten Gehirnzentren und die biochemischen Komponenten gefunden werden. Der Ellipsoidkörper des Zentralkomplexes ist für die Memotaxis in visuellen Umgebungen notwendig.rnDas Verhalten auf einem vertikalen Laufband wurde analysiert, vor allem im Hinblick auf die adaptive Termination dieses Verhaltens. Die Fliegen erkannten lange Zeit nicht, dass ihr Verhalten nicht zielführend ist und liefen stereotyp und ohne voranzukommen nach oben. Dieses Verhalten wird sogar noch verstärkt, wenn man das visuelle Feedback für die Bewertung ihres Verhaltens verstärkt. rn
Resumo:
The unsupervised categorization of sensory stimuli is typically attributed to feedforward processing in a hierarchy of cortical areas. This purely sensory-driven view of cortical processing, however, ignores any internal modulation, e.g., by top-down attentional signals or neuromodulator release. To isolate the role of internal signaling on category formation, we consider an unbroken continuum of stimuli without intrinsic category boundaries. We show that a competitive network, shaped by recurrent inhibition and endowed with Hebbian and homeostatic synaptic plasticity, can enforce stimulus categorization. The degree of competition is internally controlled by the neuronal gain and the strength of inhibition. Strong competition leads to the formation of many attracting network states, each being evoked by a distinct subset of stimuli and representing a category. Weak competition allows more neurons to be co-active, resulting in fewer but larger categories. We conclude that the granularity of cortical category formation, i.e., the number and size of emerging categories, is not simply determined by the richness of the stimulus environment, but rather by some global internal signal modulating the network dynamics. The model also explains the salient non-additivity of visual object representation observed in the monkey inferotemporal (IT) cortex. Furthermore, it offers an explanation of a previously observed, demand-dependent modulation of IT activity on a stimulus categorization task and of categorization-related cognitive deficits in schizophrenic patients.
Resumo:
The introduction of open-plan offices in the 1960s with the intent of making the workplace more flexible, efficient, and team-oriented resulted in a higher noise floor level, which not only made concentrated work more difficult, but also caused physiological problems, such as increased stress, in addition to a loss of speech privacy. Irrelevant background human speech, in particular, has proven to be a major factor in disrupting concentration and lowering performance. Therefore, reducing the intelligibility of speech and has been a goal of increasing importance in recent years. One method employed to do so is the use of masking noises, which consists in emitting a continuous noise signal over a loudspeaker system that conceals the perturbing speech. Studies have shown that while effective, the maskers employed to date – normally filtered pink noise – are generally poorly accepted by users. The collaborative "Private Workspace" project, within the scope of which this thesis was carried out, attempts to develop a coupled, adaptive noise masking system along with a physical structure to be used for open-plan offices so as to combat these issues. There is evidence to suggest that nature sounds might be more accepted as masker, in part because they can have a visual object that acts as the source for the sound. Direct audio recordings are not recommended for various reasons, and thus the nature sounds must be synthesized. This work done consists of the synthesis of a sound texture to be used as a masker as well as its evaluation. The sound texture is composed of two parts: a wind-like noise synthesized with subtractive synthesis, and a leaf-like noise synthesized through granular synthesis. Different combinations of these two noises produced five variations of the masker, which were evaluated at different levels along with white noise and pink noise using a modified version of an Oldenburger Satztest to test for an affect on speech intelligibility and a questionnaire to asses its subjective acceptance. The goal was to find which of the synthesized noises works best as a speech masker. This thesis first uses a theoretical introduction to establish the basics of sound perception, psychoacoustic masking, and sound texture synthesis. The design of each of the noises, as well as their respective implementations in MATLAB, is explained, followed by the procedures used to evaluate the maskers. The results obtained in the evaluation are analyzed. Lastly, conclusions are drawn and future work is and modifications to the masker are proposed. RESUMEN. La introducción de las oficinas abiertas en los años 60 tenía como objeto flexibilizar el ambiente laboral, hacerlo más eficiente y que estuviera más orientado al trabajo en equipo. Como consecuencia, subió el nivel de ruido de fondo, que no sólo dificulta la concentración, sino que causa problemas fisiológicos, como el aumento del estrés, además de reducir la privacidad. Hay estudios que prueban que las conversaciones de fondo en particular tienen un efecto negativo en el nivel de concentración y disminuyen el rendimiento de los trabajadores. Por lo tanto, reducir la inteligibilidad del habla es uno de los principales objetivos en la actualidad. Un método empleado para hacerlo ha sido el uso de ruido enmascarante, que consiste en reproducir señales continuas de ruido a través de un sistema de altavoces que enmascare el habla. Aunque diversos estudios demuestran que es un método eficaz, los ruidos utilizados hasta la fecha (normalmente ruido rosa filtrado), no son muy bien aceptados por los usuarios. El proyecto colaborativo "Private Workspace", dentro del cual se engloba el trabajo realizado en este Proyecto Fin de Grado, tiene por objeto desarrollar un sistema de ruido enmascarador acoplado y adaptativo, además de una estructura física, para su uso en oficinas abiertas con el fin de combatir los problemas descritos anteriormente. Existen indicios de que los sonidos naturales son mejor aceptados, en parte porque pueden tener una estructura física que simule ser la fuente de los mismos. La utilización de grabaciones directas de estos sonidos no está recomendada por varios motivos, y por lo tanto los sonidos naturales deben ser sintetizados. El presente trabajo consiste en la síntesis de una textura de sonido (en inglés sound texture) para ser usada como ruido enmascarador, además de su evaluación. La textura está compuesta de dos partes: un sonido de viento sintetizado mediante síntesis sustractiva y un sonido de hojas sintetizado mediante síntesis granular. Diferentes combinaciones de estos dos sonidos producen cinco variaciones de ruido enmascarador. Estos cinco ruidos han sido evaluados a diferentes niveles, junto con ruido blanco y ruido rosa, mediante una versión modificada de un Oldenburger Satztest para comprobar cómo afectan a la inteligibilidad del habla, y mediante un cuestionario para una evaluación subjetiva de su aceptación. El objetivo era encontrar qué ruido de los que se han sintetizado funciona mejor como enmascarador del habla. El proyecto consiste en una introducción teórica que establece las bases de la percepción del sonido, el enmascaramiento psicoacústico, y la síntesis de texturas de sonido. Se explica a continuación el diseño de cada uno de los ruidos, así como su implementación en MATLAB. Posteriormente se detallan los procedimientos empleados para evaluarlos. Los resultados obtenidos se analizan y se extraen conclusiones. Por último, se propone un posible trabajo futuro y mejoras al ruido sintetizado.
Resumo:
Classic identity negative priming (NP) refers to the finding that when an object is ignored, subsequent naming responses to it are slower than when it has not been previously ignored (Tipper, S.P., 1985. The negative priming effect: inhibitory priming by ignored objects. Q. J. Exp. Psychol. 37A, 571-590). It is unclear whether this phenomenon arises due to the involvement of abstract semantic representations that the ignored object accesses automatically. Contemporary connectionist models propose a key role for the anterior temporal cortex in the representation of abstract semantic knowledge (e.g., McClelland, J.L., Rogers, T.T., 2003. The parallel distributed processing approach to semantic cognition. Nat. Rev. Neurosci. 4, 310-322), suggesting that this region should be involved during performance of the classic identity NP task if it involves semantic access. Using high-field (4 T) event-related functional magnetic resonance imaging, we observed increased BOLD responses in the left anterolateral temporal cortex including the temporal pole that was directly related to the magnitude of each individual's NP effect, supporting a semantic locus. Additional signal increases were observed in the supplementary eye fields (SEF) and left inferior parietal lobule (IPL). (c) 2006 Elsevier Inc. All rights reserved.
Resumo:
This work is supported by the Hungarian Scientific Research Fund (OTKA), grant T042706.