28 results for Bag-of-visual Words
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual words representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
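As a rough illustration of the bag of visual words representation mentioned in this abstract, the sketch below quantizes local descriptors against a fixed vocabulary and returns a normalized word histogram. It is a minimal sketch under stated assumptions, not the authors' method: the 2-D toy descriptors stand in for dense color SIFT, the three-word vocabulary is hand-picked rather than learned by clustering, the function name `bovw_histogram` is hypothetical, and the pLSA and classifier stages are omitted.

```python
import numpy as np


def bovw_histogram(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word and count the words.

    descriptors: array of shape (n_descriptors, dim)
    vocabulary:  array of shape (n_words, dim)
    Returns a histogram of length n_words that sums to 1 (all zeros if
    there are no descriptors).
    """
    # Squared Euclidean distance from every descriptor to every visual word.
    diffs = descriptors[:, None, :] - vocabulary[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Index of the closest visual word for each descriptor.
    words = np.argmin(dists, axis=1)
    counts = np.bincount(words, minlength=len(vocabulary)).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts


# Toy data (assumed for illustration only).
vocabulary = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]])

histogram = bovw_histogram(descriptors, vocabulary)
print(histogram)
```

In a pipeline of the kind described, a histogram like this would be computed per image and then either passed to a classifier directly or first reduced to a topic distribution.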
Abstract:
See the abstract at the beginning of the document in the attached file.
Abstract:
This study addresses the issue of Spanish plural marking considering data from three sources: existent words, loan words and nonce words. Specifically, we are interested in the role of stress placement and word-final sound in the use of /-es/ for plural formation. We present data concerning the interaction of these two features for children and adults. Our findings suggest that this phenomenon is a classic example of overgeneralization in acquisition. Stress does not seem to be a determining feature by itself. Its main effect is produced when it interacts with the structure of the syllable.
Abstract:
Tone Mapping is the problem of compressing the range of a High-Dynamic Range image so that it can be displayed on a Low-Dynamic Range screen, without losing or introducing novel details: the final image should produce in the observer a sensation as close as possible to the perception produced by the real-world scene. We propose a tone mapping operator with two stages. The first stage is a global method that implements visual adaptation, based on experiments on human perception; in particular, we point out the importance of cone saturation. The second stage performs local contrast enhancement, based on a variational model inspired by color vision phenomenology. We evaluate this method with a metric validated by psychophysical experiments and, in terms of this metric, our method compares very well with the state of the art.
Abstract:
In this article, we develop a system that automatically categorizes the image database that serves as the starting point for ideation and design in the artistic production of the sculptor M. Planas. The methodology is based on local features. The visual vocabulary is built following a procedure analogous to the one used in automatic text analysis (the 'Bag-of-Words' (BOW) model); in the image domain we refer to 'Bag-of-Visual Terms' (BOV) representations. In this approach, images are analyzed as a set of regions, describing only their appearance and ignoring their spatial structure. To overcome the drawbacks of polysemy and synonymy associated with this methodology, probabilistic latent aspect analysis (PLSA) is used, which detects underlying aspects in the images, that is, formal patterns. The results obtained are promising and, beyond the intrinsic usefulness of automatic image categorization, this method can offer the artist a very interesting auxiliary point of view.
Abstract:
The heated debate over whether there is only a single mechanism or two mechanisms for morphology has diverted valuable research energy away from the more critical questions about the neural computations involved in the comprehension and production of morphologically complex forms. Cognitive neuroscience data implicate many brain areas. All extant models, whether they rely on a connectionist network or espouse two mechanisms, are too underspecified to explain why more than a few brain areas differ in their activity during the processing of regular and irregular forms. No one doubts that the brain treats regular and irregular words differently, but brain data indicate that a simplistic account will not do. It is time for us to search for the critical factors free from theoretical blinders.
Abstract:
Report for the scientific sojourn at the Swiss Federal Institute of Technology Zurich, Switzerland, between September and December 2007. In order to make robots useful assistants for our everyday life, the ability to learn and recognize objects is of essential importance. However, object recognition in real scenes is one of the most challenging problems in computer vision, as it is necessary to deal with difficulties. Furthermore, in mobile robotics a new challenge is added to the list: computational complexity. In a dynamic world, information about the objects in the scene can become obsolete before it is ready to be used if the detection algorithm is not fast enough. Two recent object recognition techniques have achieved notable results: the constellation approach proposed by Lowe and the bag of words approach proposed by Nistér and Stewénius. The Lowe constellation approach is the one currently being used in the robot localization project of the COGNIRON project. This report is divided into two main sections. The first section is devoted to briefly reviewing the currently used object recognition system, the Lowe approach, and bringing to light the drawbacks found for object recognition in the context of indoor mobile robot navigation. Additionally, the proposed improvements for the algorithm are described. In the second section the alternative bag of words method is reviewed, as well as several experiments conducted to evaluate its performance with our own object databases. Furthermore, some modifications to the original algorithm to make it suitable for object detection in unsegmented images are proposed.
Abstract:
Collage is a pattern-based visual design authoring tool for the creation of collaborative learning scripts computationally modelled with IMS Learning Design (LD). The pattern-based visual approach aims to provide teachers with design ideas that are based on broadly accepted practices. In addition, it seeks to hide the LD notation so that teachers can easily create their own designs. The use of visual representations supports both the understanding of the design ideas and the usability of the authoring tool. This paper presents a multicase study comprising three different cases that evaluate the approach from different perspectives. The first case includes workshops where teachers use Collage. A second case involves the design of a scenario proposed by a third party using related approaches. The third case analyzes a situation where students follow a design created with Collage. The cross-case analysis provides a global understanding of the possibilities and limitations of the pattern-based visual design approach.
Abstract:
The meaning of a novel word can be acquired by extracting it from linguistic context. Here we simulated the learning of new words associated with concrete and abstract concepts in a variant of the human simulation paradigm that provided linguistic context information in order to characterize the brain systems involved. Native speakers of Spanish read pairs of sentences in order to derive the meaning of a new word that appeared in the terminal position of the sentences. fMRI revealed that learning the meaning associated with concrete and abstract new words was qualitatively different and recruited brain regions similar to those involved in the processing of real concrete and abstract words. In particular, learning of new concrete words selectively boosted the activation of the ventral anterior fusiform gyrus, a region driven by imageability, which has previously been implicated in the processing of concrete words.
Abstract:
Total lack of visual experience [dark rearing (DR)] is known to prolong the critical period and delay development of sensory functions in mammalian visual cortex. Recent results show that neurotrophins (NTs) counteract the effects of DR on functional properties of visual cortical cells and exert a strong control on critical period duration. NTs are known to modulate the development and synaptic efficacy of neurotransmitter systems that are affected by DR. However, it is still unknown whether the actions of NTs in dark-reared animals involve interaction with neurotransmitter systems. We have studied the effects of DR on the expression of key molecules in the glutamatergic and GABAergic systems in control and NT-treated animals. We have found that DR reduced the expression of the NMDA receptor 2A subunit and its associated protein PSD-95 (postsynaptic density-95), of GRIP (AMPA glutamate receptor interacting protein), and of the biosynthetic enzyme GAD (glutamic acid decarboxylase). Returning dark-reared animals to light for 2 hr restored normal expression of the above-mentioned proteins almost completely. NT treatment specifically counteracts DR effects; NGF acts primarily on the NMDA system, whereas BDNF acts primarily on the GABAergic system. Finally, the action of NT4 seems to involve both excitatory and inhibitory systems. These data demonstrate that different NTs counteract DR effects by modulating the expression of key molecules of the excitatory and inhibitory neurotransmitter systems.
Abstract:
BACKGROUND: In the context of population aging, visual impairment has emerged as a growing concern in public health. However, there is a need for further research into the relationship between visual impairment and chronic medical conditions in the elderly. The aim of our study was to examine the relationship between visual impairment and three main types of co-morbidity: chronic physical conditions (both at an independent and additive level), mental health and cognitive functioning. METHODS: Data were collected from the COURAGE in Europe project, a cross-sectional study. A total of 4,583 participants from Spain were included. Diagnosis of chronic medical conditions included self-reported medical diagnosis and symptomatic algorithms. Depression and anxiety were assessed using CIDI algorithms. Visual assessment included objective distance/near visual acuity and subjective visual performance. Descriptive analyses included the whole sample (n = 4,583). Statistical analyses included participants aged over 50 years (n = 3,625; mean age = 66.45 years) since they have a significant prevalence of chronic conditions and visual impairment. Crude and adjusted binary logistic regressions were performed to identify independent associations between visual impairment and chronic medical conditions, physical multimorbidity and mental conditions. Covariates included age, gender, marital status, education level, employment status and urbanicity. RESULTS: The number of chronic physical conditions was found to be associated with poorer results in both distance and near visual acuity [OR 1.75 (CI 1.38-2.23); OR 1.69 (CI 1.27-2.24)]. At an independent level, arthritis, stroke and diabetes were associated with poorer distance visual acuity results after adjusting for covariates [OR 1.79 (CI 1.46-2.21); OR 1.59 (CI 1.05-2.42); OR 1.27 (CI 1.01-1.60)]. Only stroke was associated with near visual impairment [OR 3.01 (CI 1.86-4.87)]. With regard to mental health, poor subjective visual acuity was associated with depression [OR 1.61 (CI 1.14-2.27); OR 1.48 (CI 1.03-2.13)]. Both objective and subjective poor distance and near visual acuity were associated with worse cognitive functioning. CONCLUSIONS: Arthritis, stroke and the co-occurrence of various chronic physical diseases are associated with higher prevalence of visual impairment. Visual impairment is associated with higher prevalence of depression and poorer cognitive function results. There is a need to implement patient-centered care involving special visual assessment in these cases.
Abstract:
The usual way to investigate the statistical properties of finitely generated subgroups of free groups, and of finite presentations of groups, is based on the so-called word-based distribution: subgroups are generated (finite presentations are determined) by randomly chosen k-tuples of reduced words, whose maximal length is allowed to tend to infinity. In this paper we adopt a different, though equally natural point of view: we investigate the statistical properties of the same objects, but with respect to the so-called graph-based distribution, recently introduced by Bassino, Nicaud and Weil. Here, subgroups (and finite presentations) are determined by randomly chosen Stallings graphs whose number of vertices tends to infinity. Our results show that these two distributions behave quite differently from each other, shedding new light on which properties of finitely generated subgroups can be considered frequent or rare. For example, we show that malnormal subgroups of a free group are negligible in the graph-based distribution, while they are exponentially generic in the word-based distribution. Quite surprisingly, a random finite presentation generically presents the trivial group in this new distribution, while in the classical one it is known to generically present an infinite hyperbolic group.
Abstract:
This paper focuses on the problem of realizing a plane-to-plane virtual link between a camera attached to the end-effector of a robot and a planar object. To make the system independent of the object's surface appearance, a structured light emitter is linked to the camera so that four laser pointers are projected onto the object. In a previous paper we showed that such a system has good performance and desirable characteristics such as partial decoupling near the desired state and robustness against misalignment of the emitter and the camera (J. Pages et al., 2004). However, no analytical results concerning the global asymptotic stability of the system were obtained due to the high complexity of the visual features utilized. In this work we present a better set of visual features which improves the properties of the features in (J. Pages et al., 2004) and for which it is possible to prove the global asymptotic stability.
Abstract:
In this paper we address the problem of positioning a camera attached to the end-effector of a robotic manipulator so that it becomes parallel to a planar object. Such a problem has been treated for a long time in visual servoing. Our approach is based on attaching several laser pointers to the camera, configured so as to produce a suitable set of visual features. Structured light is used not only to ease the image processing and to allow low-textured objects to be treated, but also to produce a control scheme with desirable properties such as decoupling, stability, good conditioning and a good camera trajectory.
Abstract:
Report for the scientific sojourn carried out at the University Medical Center, Switzerland, from 2010 to 2012. Abundant evidence suggests that negative emotional stimuli are prioritized in the perceptual systems, eliciting enhanced neural responses in early sensory regions as compared with neutral information. This facilitated detection is generally paralleled by larger neural responses in early sensory areas, relative to the processing of neutral information. In this sense, the amygdala and other limbic regions, such as the orbitofrontal cortex, may play a critical role by sending modulatory projections onto the sensory cortices via direct or indirect feedback. The present project aimed at investigating two important issues regarding these mechanisms of emotional attention, by means of functional magnetic resonance imaging. In Study I, we examined the modulatory effects of visual emotion signals on the processing of task-irrelevant visual, auditory, and somatosensory input, that is, the intramodal and crossmodal effects of emotional attention. We observed that brain responses to auditory and tactile stimulation were enhanced during the processing of visual emotional stimuli, as compared to neutral, in bilateral primary auditory and somatosensory cortices, respectively. However, brain responses to visual task-irrelevant stimulation were diminished in left primary and secondary visual cortices in the same conditions. The results also suggested the existence of a multimodal network associated with emotional attention, presumably involving mediofrontal, temporal and orbitofrontal regions. Finally, Study II examined the different brain responses along the low-level visual pathways and limbic regions, as a function of the number of retinal spikes during visual emotional processing. The experiment used stimuli resulting from an algorithm that simulates how the visual system perceives a visual input after a given number of retinal spikes. The results validated the visual model in human subjects and suggested differential emotional responses in the amygdala and visual regions as a function of spike-levels. A list of publications resulting from work in the host laboratory is included in the report.