975 results for Visual Object Identification
Abstract:
A low-complexity but highly efficient object-counting algorithm is presented that can be embedded in hardware with low computational power. This is achieved by a novel soft data-association strategy that can handle multimodal distributions.
Abstract:
This work addresses the problem of object tracking, whose goal is to find the trajectory of an object in a video sequence. To this end, we developed a tracking-by-detection method that builds an appearance model in a compressed domain using a novel technique: compressive sensing. The only information required is the location of the target object in the first frame of the sequence. Object tracking is a typical application in the field of computer vision and has been under development for many years. Even so, it remains a challenging task because of several factors: illumination changes, partial or total occlusion of objects, and the complexity of the scene background, all of which must be considered to achieve robust tracking. To deal with these factors as effectively as possible, we propose a tracking algorithm that trains a Support Vector Machine (SVM) classifier online to separate objects from the scene background. For this purpose, we build our appearance model with a highly robust feature descriptor that describes the objects and the background and returns a very high-dimensional vector. A subsequent step therefore reduces the dimensionality of these vectors so that the classifier can be trained in a much smaller domain, which we call the compressed domain. The dimensionality reduction of the feature vectors is based on the theory of compressive sensing, which states that a sparse signal (one with few nonzero components) can be well represented, and even reconstructed, from a very small set of samples.
The theory of compressive sensing was applied successfully in this work, and different measurement and reconstruction techniques were tested to evaluate our reduced vectors, verifying that they preserve the information in the original vectors. We also update the appearance model of the target object by retraining the classifier at every frame of the sequence with positive and negative samples obtained from the position predicted by the tracking algorithm at each time step. The proposed algorithm was evaluated on several sequences and compared with other state-of-the-art tracking algorithms to demonstrate the success of our method.
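The dimensionality-reduction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian random measurement matrix, the vector sizes, and the helper names (make_measurement_matrix, compress, sparse_vector) are all assumptions chosen for illustration. The sketch only shows that a small number of random measurements roughly preserves the distance between two sparse, high-dimensional feature vectors, which is the property a classifier trained in the compressed domain relies on.

```python
import numpy as np


def make_measurement_matrix(n_measurements, n_features, seed=0):
    """Gaussian random matrix (an assumed choice), scaled so that
    squared norms are preserved in expectation."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_measurements, n_features)) / np.sqrt(n_measurements)


def compress(features, matrix):
    """Project one feature vector (or a batch of row vectors) into the compressed domain."""
    return features @ matrix.T


def sparse_vector(n_features, n_nonzero, rng):
    """Hypothetical stand-in for a sparse high-dimensional feature descriptor."""
    x = np.zeros(n_features)
    idx = rng.choice(n_features, size=n_nonzero, replace=False)
    x[idx] = rng.standard_normal(n_nonzero)
    return x


rng = np.random.default_rng(1)
n_features, n_measurements = 10_000, 400  # illustrative sizes, not from the source

phi = make_measurement_matrix(n_measurements, n_features)
a = sparse_vector(n_features, 50, rng)
b = sparse_vector(n_features, 50, rng)

za, zb = compress(a, phi), compress(b, phi)

# Ratio of compressed-domain distance to original distance; values close
# to 1 mean the projection kept the geometry of the two vectors.
ratio = np.linalg.norm(za - zb) / np.linalg.norm(a - b)
print(f"compressed size: {za.shape[0]}, distance ratio: {ratio:.3f}")
```

In a tracker of the kind described, the compressed vectors would then be fed to an online classifier in place of the original descriptors; that part is omitted here.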
Abstract:
The emergence of new horizons in the field of travel assistant management is driving the development of cutting-edge systems that improve on existing ones. New opportunities are also arising as systems tend to become more reliable and autonomous. In this paper, a self-learning embedded system for object identification based on adaptive-cooperative dynamic approaches is presented for intelligent sensor infrastructures. The proposed system is able to detect and identify moving objects using a dynamic decision tree. It combines machine learning algorithms and cooperative strategies to make the system more adaptive to changing environments. The proposed system may therefore be useful for applications such as shadow tolls (since several types of vehicles can be distinguished), parking optimization systems, and traffic improvement systems.
Abstract:
Studies are divided on the cognitive sequelae of concussions: some suggest that they resolve quickly, whereas others indicate that they persist over time. However, no data exist to indicate whether a cognitive task such as visual mental imagery reveals sequelae following a concussion. The present study therefore aims to assess the effect of sports-related concussions on athletes' capacity for visual object imagery and spatial imagery. To this end, we compare mental imagery abilities in male university-level football players with no recorded history of concussion (n=15) and in a second group of athletes who have sustained at least one concussion (n=15). Our hypothesis is that non-concussed athletes have better mental imagery than concussed athletes. The results refute our hypothesis. Concussed athletes perform as well as non-concussed athletes on the following three tests: the Paper Folding Test (PFT), the Visual Object Identification Task (VOIT), and the Vividness of Visual Imagery Questionnaire (VVIQ). Moreover, neither the number of concussions nor the time elapsed since the last concussion affects the performance of concussed athletes.
Abstract:
Classic identity negative priming (NP) refers to the finding that when an object is ignored, subsequent naming responses to it are slower than when it has not been previously ignored (Tipper, S.P., 1985. The negative priming effect: inhibitory priming by ignored objects. Q. J. Exp. Psychol. 37A, 571-590). It is unclear whether this phenomenon arises due to the involvement of abstract semantic representations that the ignored object accesses automatically. Contemporary connectionist models propose a key role for the anterior temporal cortex in the representation of abstract semantic knowledge (e.g., McClelland, J.L., Rogers, T.T., 2003. The parallel distributed processing approach to semantic cognition. Nat. Rev. Neurosci. 4, 310-322), suggesting that this region should be involved during performance of the classic identity NP task if it involves semantic access. Using high-field (4 T) event-related functional magnetic resonance imaging, we observed increased BOLD responses in the left anterolateral temporal cortex including the temporal pole that was directly related to the magnitude of each individual's NP effect, supporting a semantic locus. Additional signal increases were observed in the supplementary eye fields (SEF) and left inferior parietal lobule (IPL). (c) 2006 Elsevier Inc. All rights reserved.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
The project “Reference in Discourse” deals with the selection of a specific object from a visual scene in a natural language situation. The goal of this research is to explain this everyday discourse reference task in terms of a concept generation process based on subconceptual visual and verbal information. The system OINC (Object Identification in Natural Communicators) aims at solving this problem in a psychologically adequate way. The system’s difficulties occurring with incomplete and deviant descriptions correspond to the data from experiments with human subjects. The results of these experiments are reported.
Abstract:
Color is a perceptual attribute that allows us to identify and localize environmental patterns of equal brightness, and it constitutes an additional dimension in object identification, alongside the detection of numerous other attributes of objects in their relation to the visual scene, such as luminance, contrast, shape, motion, texture, and depth. Hence its fundamental importance in the activities performed by animals and humans in their interaction with the environment. Visual psychophysics is concerned with the quantitative study of the relationship between the physical events of sensory stimulation and the behavioral response resulting from that stimulation, thereby providing means of evaluating aspects of human vision such as color vision. This article aims to present several efficient techniques for evaluating human color vision through adaptive psychophysical methods.
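As an illustration of what an adaptive psychophysical method looks like in practice, the sketch below implements a generic 2-down/1-up staircase. It is not necessarily one of the techniques covered by the article; the function names, the integer stimulus units (for example, chromatic contrast in percent), and the simulated observer are assumptions made for the example. The stimulus level drops after two consecutive correct responses and rises after each error, and the threshold estimate is the mean of the levels at which the direction reversed.

```python
def run_staircase(respond, start, step, n_reversals=8, floor=0, max_trials=1000):
    """2-down/1-up staircase.

    respond(level) returns True for a correct response. The level goes
    down after two correct responses in a row and up after every error.
    Returns the mean level over the recorded reversals.
    """
    level = start
    correct_streak = 0
    last_direction = 0  # -1 = last move was down, +1 = up, 0 = no move yet
    reversals = []

    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(level):
            correct_streak += 1
            if correct_streak < 2:
                continue  # need a second correct response before moving down
            correct_streak = 0
            direction = -1
        else:
            correct_streak = 0
            direction = +1

        # A reversal is a change of direction relative to the previous move.
        if last_direction and direction != last_direction:
            reversals.append(level)
        last_direction = direction
        level = max(floor, level + direction * step)

    if not reversals:
        raise ValueError("no reversals observed; adjust start, step, or max_trials")
    return sum(reversals) / len(reversals)


def ideal_observer(level, threshold=35):
    """Hypothetical noise-free observer: correct whenever the stimulus
    level is at or above its threshold."""
    return level >= threshold


estimate = run_staircase(ideal_observer, start=100, step=10)
print(estimate)
```

With this noise-free observer the staircase settles into alternating between the two levels that bracket the threshold. With a real, probabilistic observer the same rule converges on the level that yields roughly 70.7% correct responses.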
Abstract:
The world is swiftly adapting to visual communication. Online services such as YouTube and Vine show that video is no longer the domain of broadcast television alone. Video is used for different purposes, such as entertainment, information, education, and communication. The rapid growth of today's video archives, for which editorial data are only sparsely available, makes retrieval a major problem. Humans see a video as a complex interplay of cognitive concepts. As a result, there is a need to build a bridge between numeric values and semantic concepts, establishing a connection that will facilitate video retrieval by humans. The critical aspect of this bridge is video annotation, which can be done manually or automatically. Manual annotation is tedious, subjective, and expensive; automatic annotation is therefore being actively studied. In this thesis we focus on the automatic annotation of multimedia content, namely the use of analysis techniques for information retrieval that allow metadata to be extracted automatically from video in a videomail system. This includes the identification of text, people, actions, spaces, and objects, including animals and plants. It will thus be possible to align multimedia content with the text presented in the email message and to create applications for semantic video database indexing and retrieval.
Abstract:
Past multisensory experiences can influence current unisensory processing and memory performance. Repeated images are better discriminated if initially presented as auditory-visual pairs, rather than only visually. An experience's context thus plays a role in how well repetitions of certain aspects are later recognized. Here, we investigated factors during the initial multisensory experience that are essential for generating improved memory performance. Subjects discriminated repeated versus initial image presentations intermixed within a continuous recognition task. Half of initial presentations were multisensory, and all repetitions were only visual. Experiment 1 examined whether purely episodic multisensory information suffices for enhancing later discrimination performance by pairing visual objects with either tones or vibrations. We could therefore also assess whether effects can be elicited with different sensory pairings. Experiment 2 examined semantic context by manipulating the congruence between auditory and visual object stimuli within blocks of trials. Relative to images only encountered visually, accuracy in discriminating image repetitions was significantly impaired by auditory-visual, yet unaffected by somatosensory-visual multisensory memory traces. By contrast, this accuracy was selectively enhanced for visual stimuli with semantically congruent multisensory pasts and unchanged for those with semantically incongruent multisensory pasts. The collective results reveal opposing effects of purely episodic versus semantic information from auditory-visual multisensory events. Nonetheless, both types of multisensory memory traces are accessible for processing incoming stimuli and indeed result in distinct visual object processing, leading to either impaired or enhanced performance relative to unisensory memory traces. We discuss these results as supporting a model of object-based multisensory interactions.
Abstract:
Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve the solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, necessitates the development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications in which most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space.
The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but it also has potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
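To make the idea of finding a maximum of a Gaussian kernel density concrete, here is a minimal sketch using plain mean shift. This is a substitute technique chosen for brevity, not the trust region Newton method developed in the thesis, and it finds ordinary maxima (modes) rather than ridges. The function name, bandwidth, and synthetic two-cluster data are assumptions made for illustration.

```python
import numpy as np


def mean_shift_mode(data, start, bandwidth, tol=1e-8, max_iter=500):
    """Climb to a local maximum of the Gaussian kernel density of `data`.

    Each iteration moves the point to the kernel-weighted mean of the
    samples, which is the standard mean-shift update.
    """
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        sq_dist = np.sum((data - x) ** 2, axis=1)
        weights = np.exp(-0.5 * sq_dist / bandwidth**2)
        new_x = weights @ data / weights.sum()
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Synthetic data: two well-separated 2-D clusters (illustrative only).
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(200, 2)),
    rng.normal(loc=(4.0, 4.0), scale=0.3, size=(200, 2)),
])

# Starting nearer the second cluster, the iteration should settle on its mode.
mode = mean_shift_mode(data, start=(3.0, 3.5), bandwidth=0.5)
print(mode)
```

In a tracking setting, the samples would typically be weighted image locations and the starting point would be the target's previous position; those details are outside the scope of this sketch.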
Abstract:
The crowding effect, which prevents us from correctly identifying a visual stimulus when it is surrounded by flankers, is ubiquitous across a wide variety of stimulus classes. The eccentricity of the target stimulus and the target-flanker distance are fundamental factors that modulate the crowding effect. Target-flanker similarity also appears to contribute to the magnitude of the crowding effect, according to data obtained with non-linguistic stimuli. The present study examined these three factors in conjunction with the spatial frequency content of the stimuli in a letter identification task. We presented filtered images of letters to non-dyslexic subjects free of neurological disorders while manipulating target eccentricity and target-flanker similarity (according to pre-established confusion matrices). Four types of spatial frequency filtering were used: low-pass, high-pass, broadband, and mixed (i.e., removal of the medium frequencies, known to be optimal for letter identification). These conditions were matched in terms of contrast energy. Subjects had to identify the target letter as quickly as possible while avoiding errors. The results show that target-flanker similarity amplifies the crowding effect, i.e., the joint effect of distance and eccentricity. This extends knowledge about the impact of similarity on crowding to the visual identification of linguistic stimuli. In addition, the magnitude of the crowding effect is greatest with the low-pass filter, followed by the mixed filter, the high-pass filter, and the broadband filter, with significant differences between consecutive conditions.
We conclude that: 1- medium spatial frequencies offer optimal protection against crowding in letter identification; 2- when medium spatial frequencies are absent from the stimulus, high frequencies protect against crowding whereas low frequencies amplify it, probably through their opposite effects on the availability of information about the distinctive features of the stimuli.
Abstract:
OBJECTIVE: To test the prediction by the Perception and Attention Deficit (PAD) model of complex visual hallucinations that cognitive impairment, specifically in visual attention, is a key risk factor for complex hallucinations in eye disease. METHODS: Two studies of elderly patients with acquired eye disease investigated the relationship between complex visual hallucinations (CVH) and impairments in general cognition and verbal attention (Study 1) and between CVH, selective visual attention and visual object perception (Study 2). The North East Visual Hallucinations Inventory was used to classify CVH. RESULTS: In Study 1, there was no relationship between CVH (n=10/39) and performance on cognitive screening or verbal attention tasks. In Study 2, participants with CVH (n=11/31) showed poorer performance on a modified Stroop task (p<0.05), a novel imagery-based attentional task (p<0.05) and picture (p<0.05) but not silhouette naming (p=0.13) tasks. Performance on these tasks correctly classified 83% of the participants as hallucinators or non-hallucinators. CONCLUSIONS: The results suggest that, consistent with the PAD model, complex visual hallucinations in people with acquired eye disease are associated with visual attention impairment.
Abstract:
Background: A prerequisite for high performance in motor tasks is the acquisition of egocentric sensory information that must be translated into motor actions. A phenomenon that supports this process is the Quiet Eye (QE), defined as a long final fixation before movement initiation. It is assumed that the QE facilitates information processing, particularly regarding movement parameterization. Aims: The question remains whether this facilitation also holds for the information-processing stage of response selection and for the stage of stimulus identification, which is crucial for perception. Method: In two experiments with sport science students, performance-enhancing effects of experimentally manipulated QE durations were tested as a function of target position predictability and target visibility, thereby selectively manipulating response selection and stimulus identification demands, respectively. Results: The results support the hypothesis of facilitated information processing through long QE durations, since in both experiments performance-enhancing effects of long QE durations were found under increased processing demands only. In Experiment 1, QE duration affected performance only if the target position was not predictable and positional information had to be processed over the QE period. In Experiment 2, in a full vs. no target visibility comparison with saccades to the upcoming target position induced by flicker cues, the functionality of a long QE duration depended on the visual stimulus identification period as soon as the interval fell below a certain threshold. Conclusions: The results corroborate earlier findings that QE efficiency depends on demands put on the visuomotor system, thereby furthering the assumption that the phenomenon supports the processes of sensorimotor integration.