998 results for Contextual visual localization
Abstract:
In this paper, we present a novel coarse-to-fine visual localization approach: contextual visual localization. This approach relies on three elements: (i) a minimal-complexity classifier for performing fast coarse localization (submap classification); (ii) an optimized saliency detector which exploits the visual statistics of the submap; and (iii) a fast view-matching algorithm which filters initial matchings with a structural criterion. The latter algorithm yields fine localization. Our experiments show that these elements can be successfully integrated to solve the global localization problem. Context, that is, the awareness of being in a particular submap, is defined by a supervised classifier tuned for a minimal set of features. Visual context is exploited both for tuning (optimizing) the saliency detection process and for selecting potential matching views in the visual database that are close enough to the query view.
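The coarse-to-fine structure described above can be illustrated compactly. The following Python sketch mimics only that structure: a nearest-centroid classifier stands in for the minimal-complexity submap classifier, and plain descriptor distances stand in for the saliency-tuned, structurally filtered view matching. All names and data are invented for the example; this is not the paper's implementation.

```python
import numpy as np

# Minimal coarse-to-fine localization sketch (illustrative only).
# Coarse step: a lightweight classifier assigns the query to a submap.
# Fine step: the query is matched only against views stored for that submap.

class CoarseToFineLocalizer:
    def __init__(self, submap_centroids, submap_views):
        # submap_centroids: (n_submaps, d) feature centroids used as a
        # minimal-complexity classifier (nearest centroid).
        # submap_views: dict submap_id -> list of (view_id, descriptor) pairs.
        self.centroids = np.asarray(submap_centroids, dtype=float)
        self.views = submap_views

    def coarse(self, query_feat):
        # Nearest-centroid submap classification.
        d = np.linalg.norm(self.centroids - query_feat, axis=1)
        return int(np.argmin(d))

    def fine(self, submap_id, query_desc):
        # Rank the candidate views of the selected submap by descriptor distance.
        candidates = self.views[submap_id]
        scored = [(vid, np.linalg.norm(desc - query_desc)) for vid, desc in candidates]
        return min(scored, key=lambda s: s[1])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centroids = rng.normal(size=(3, 8))
    views = {i: [(f"view_{i}_{j}", centroids[i] + 0.1 * rng.normal(size=8))
                 for j in range(5)] for i in range(3)}
    loc = CoarseToFineLocalizer(centroids, views)
    query = centroids[1] + 0.05 * rng.normal(size=8)
    submap = loc.coarse(query)
    best_view, dist = loc.fine(submap, query)
    print(submap, best_view, round(dist, 3))
```

Restricting the fine search to the views of the predicted submap is what keeps the second stage fast, at the cost of depending on a correct coarse decision.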
Abstract:
In the last decade, local image features have been widely used in robot visual localization. To assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image to those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, we compare several candidate combiners with respect to their performance in the visual localization task. A deeper insight into the potential of the sum and product combiners is provided by testing two extensions of these algebraic rules: threshold and weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. All combiners are tested on a visual localization task carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions globally achieve the best performance. The voting method, whilst competitive with the algebraic rules in their standard form, is shown to be outperformed by both of their modified versions.
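As a rough illustration of the non-trained combiners compared in this abstract, the sketch below applies sum, product (computed in log space), and majority-vote rules to a small score matrix. The matrix values are invented and the code is not tied to the paper's experimental setup.

```python
import numpy as np

# Illustrative combiners for place scoring. Each row of `scores` holds the
# support values produced for one place by several individual matching
# results (columns); the combiner aggregates them into one score per place.

def sum_rule(scores):
    return scores.sum(axis=1)

def product_rule(scores, eps=1e-12):
    # Work in log space to avoid numerical underflow.
    return np.log(scores + eps).sum(axis=1)

def majority_vote(scores):
    # Each column votes for the place it supports most strongly.
    winners = scores.argmax(axis=0)
    return np.bincount(winners, minlength=scores.shape[0]).astype(float)

if __name__ == "__main__":
    # 3 places, 4 individual matching results (columns).
    scores = np.array([[0.7, 0.2, 0.6, 0.5],
                       [0.2, 0.6, 0.3, 0.4],
                       [0.1, 0.2, 0.1, 0.1]])
    for f in (sum_rule, product_rule, majority_vote):
        print(f.__name__, f(scores), "->", int(f(scores).argmax()))
```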
Abstract:
In the last decade, local image features have been widely used in robot visual localization. In order to assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image with those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, we compare several candidate combiners with respect to their performance in the visual localization task. For this evaluation, we selected the most popular methods in the class of non-trained combiners, namely the sum rule and the product rule. A deeper insight into the potential of these combiners is provided through a discriminativity analysis involving the algebraic rules and two extensions of these methods: the threshold and the weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. Furthermore, we address the process of constructing a model of the environment by describing how model granularity impacts performance. All combiners are tested on a visual localization task carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions globally achieve the best performance, confirming the general agreement on the robustness of this rule in other classification problems. The voting method, whilst competitive with the product rule in its standard form, is shown to be outperformed by its modified versions.
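The threshold and weighted modifications mentioned above can also be sketched briefly. In the hypothetical example below, weak supports are clipped before summing and each contributing matcher is given a reliability weight; the threshold `tau` and the weight vector are illustrative parameters, not values taken from the paper.

```python
import numpy as np

# Illustrative threshold and weighted modifications of the sum rule.

def thresholded_sum(scores, tau=0.3):
    # Suppress weak supports before summing so noisy matches do not accumulate.
    clipped = np.where(scores >= tau, scores, 0.0)
    return clipped.sum(axis=1)

def weighted_sum(scores, weights):
    # Weight each contributing matcher by its (assumed) reliability.
    weights = np.asarray(weights, dtype=float)
    return scores @ (weights / weights.sum())

if __name__ == "__main__":
    scores = np.array([[0.7, 0.2, 0.6, 0.5],
                       [0.2, 0.6, 0.3, 0.4],
                       [0.1, 0.2, 0.1, 0.1]])
    print(thresholded_sum(scores))
    print(weighted_sum(scores, weights=[2.0, 1.0, 1.0, 0.5]))
```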
Abstract:
Localization, the ability of a mobile robot to estimate its position within its environment, is a key capability for the autonomous operation of any mobile robot. This thesis presents a system for indoor coarse and global localization of a mobile robot based on visual information. The system is based on image matching and uses SIFT features as natural landmarks. Features extracted from training images are stored in a database for later use in localization. During localization, an image of the scene is captured with the robot's on-board camera, features are extracted from the image, and the best match is searched for in the database. Feature matching is done using the k-d tree algorithm. Experimental results showed that localization accuracy increases with the number of training features stored in the database, while, on the other hand, a larger number of features tended to increase the computational time. For some parts of the environment the error rate was relatively high because features extracted at those places were strongly correlated with features from other parts of the environment.
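A minimal sketch of the database-matching step described above follows. It assumes descriptors have already been extracted (random vectors stand in for 128-D SIFT descriptors) and uses a SciPy k-d tree for the nearest-neighbour search; predicting the place by vote counting is an illustrative simplification, not the thesis's exact procedure.

```python
import numpy as np
from scipy.spatial import cKDTree

# Descriptors from training images are stored with a place label, and a k-d
# tree answers nearest-neighbour queries for the current image's descriptors.

rng = np.random.default_rng(1)
n_places, per_place = 4, 50
db_desc = rng.normal(size=(n_places * per_place, 128))   # stand-in SIFT descriptors
db_label = np.repeat(np.arange(n_places), per_place)     # place label per descriptor

tree = cKDTree(db_desc)

# "Query image": descriptors drawn near the vectors stored for place 2.
query = db_desc[db_label == 2][:20] + 0.05 * rng.normal(size=(20, 128))

_, idx = tree.query(query, k=1)                 # nearest stored descriptor per query feature
votes = np.bincount(db_label[idx], minlength=n_places)
print("votes per place:", votes, "-> estimated place:", int(votes.argmax()))
```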
Abstract:
Vacant lots are an undeniable element in the composition of Montreal's urban fabric. Their sustained presence has long attracted the attention of many authors and municipal decision-makers. However, little is known about the landscape characteristics of these spaces. This research in planning aims to extend our knowledge of this type of urban space. It focuses on the landscape characterization of vacant lots in downtown Montreal and on the study of their visual potential to highlight the significant attributes of the urban landscape. Together, these two studies are intended to shed light on the role these voids play in the perception of the urban landscape. The approach asks whether certain voids can be justified and legitimized with respect to the notion of legibility of the urban landscape (Lynch, 1976, 1982). Vacant lots would thus play an important role in the perception of urban landscapes. The aim is to demonstrate the potential of vacant spaces for enhancing the urban landscape and, for some of them, to legitimize the void, or the part of the void, that defines them and to make them structuring elements of the urban composition. Through the observation of urban, contextual, visual, and physical characteristics, the study was able both to draw a portrait of these spaces awaiting urban development and to demonstrate their contribution to urban legibility. This work offers a statement on planning the development of vacant lots in downtown Montreal with respect to the notion of urban legibility as an integral part of urban quality.
Abstract:
In this paper, we propose a novel method for the unsupervised clustering of graphs in the context of the constellation approach to object recognition. The method is an EM central clustering algorithm which builds prototypical graphs on the basis of fast matching with graph transformations. Our experiments, both with random graphs and in realistic situations (visual localization), show that our prototypes improve on the set median graphs and also on the prototypes derived from our previous incremental method. We also discuss how the method scales with a growing number of images.
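For readers unfamiliar with central clustering, the simplified sketch below runs an EM-style assign/re-estimate loop over a precomputed pairwise graph-distance matrix, with medoid graphs acting as prototypes. This is only an approximation of the approach summarized above, which constructs prototype graphs through matching and graph transformations rather than picking medoids; the distances in the demo are synthetic stand-ins.

```python
import numpy as np

# Simplified central clustering over a precomputed graph-distance matrix.

def central_clustering(dist, k, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    prototypes = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # E-step: assign each graph to its closest prototype.
        assign = dist[:, prototypes].argmin(axis=1)
        # M-step: re-estimate each prototype as the medoid of its cluster.
        new_protos = []
        for c in range(k):
            members = np.flatnonzero(assign == c)
            if members.size == 0:
                new_protos.append(prototypes[c])
                continue
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_protos.append(members[within.argmin()])
        new_protos = np.array(new_protos)
        if np.array_equal(new_protos, prototypes):
            break
        prototypes = new_protos
    return prototypes, assign

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    pts = np.concatenate([rng.normal(loc=c, size=(10, 2)) for c in (0.0, 5.0)])
    dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)  # stand-in graph distances
    protos, assign = central_clustering(dist, k=2)
    print(protos, assign)
```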
Abstract:
This paper discusses the target localization problem in wireless visual sensor networks. Specifically, each node, equipped with a low-resolution camera, extracts multiple feature points to represent the target at the sensor node level. A statistical method is presented that merges the position information of different sensor nodes by selecting the most correlated feature point pair at the base station. This method reduces the influence of target extraction accuracy on the accuracy of target localization in the universal coordinate system. Simulations show that, compared with a related approach, the proposed method achieves better target localization accuracy and a better trade-off between camera node usage and localization accuracy.
Abstract:
This paper discusses the target localization problem in wireless visual sensor networks. Additive noise and measurement errors affect the accuracy of target localization when the visual nodes are equipped with low-resolution cameras. With the goal of improving localization accuracy without prior knowledge of the target, each node extracts multiple feature points from its images to represent the target at the sensor node level. A statistical method is presented to match the most correlated feature point pair, merging the position information of different sensor nodes at the base station. In addition, for the case in which more than one target is present in the field of interest, a scheme for locating multiple targets is provided. Simulation results show that the proposed method performs well in improving the accuracy of locating a single target or multiple targets. The results also show that the proposed method offers a better trade-off between camera node usage and localization accuracy.
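Once corresponding feature points have been matched across nodes, fusing their observations amounts to intersecting viewing rays. The 2-D sketch below is a deliberately simplified stand-in for that fusion step: it assumes known node positions and bearing measurements (all values invented) and is not the paper's statistical matching method.

```python
import numpy as np

# Two camera nodes each report a bearing to the same matched feature point;
# the base station intersects the two rays to estimate the target position.

def intersect_bearings(p1, theta1, p2, theta2):
    # Solve p1 + t1*d1 = p2 + t2*d2 for the ray parameters t1, t2.
    d1 = np.array([np.cos(theta1), np.sin(theta1)])
    d2 = np.array([np.cos(theta2), np.sin(theta2)])
    A = np.column_stack([d1, -d2])
    t = np.linalg.solve(A, np.asarray(p2) - np.asarray(p1))
    return np.asarray(p1) + t[0] * d1

if __name__ == "__main__":
    target = np.array([4.0, 3.0])
    node1, node2 = np.array([0.0, 0.0]), np.array([8.0, 0.0])
    b1 = np.arctan2(*(target - node1)[::-1]) + 0.01   # noisy bearing measurements
    b2 = np.arctan2(*(target - node2)[::-1]) - 0.01
    print(intersect_bearings(node1, b1, node2, b2))
```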
Abstract:
Blindsight is a phenomenon in which human patients with damage to striate cortex deny any visual sensation in the resultant visual field defect but can nonetheless detect and localize stimuli when persuaded to guess. Although monkeys with striate lesions have also been shown to exhibit some residual vision, it is not yet clear to what extent the residual capacities in monkeys parallel the phenomenon of human blindsight. To clarify this issue, we trained two monkeys with unilateral lesions of striate cortex to make saccadic eye movements to visual targets in both hemifields under two conditions. In the condition analogous to clinical perimetry, they failed to initiate saccades to targets presented in the contralateral hemifield and thus appeared "blind." Only in the condition where the fixation point was turned off simultaneously with the onset of the target--signaling the animal to respond at the appropriate time--were monkeys able to localize targets contralateral to the striate lesion. These results indicate that the conditions under which residual vision is demonstrable are similar for monkeys with striate cortex damage and humans with blindsight.
Abstract:
This paper illustrates a method for finding useful visual landmarks for performing simultaneous localization and mapping (SLAM). The method is based loosely on biological principles, using layers of filtering and pooling to create learned templates that correspond to different views of the environment. Rather than using a set of landmarks and reporting range and bearing to each landmark, this system maps views to poses. The challenge is to design a system that yields the same view for small changes in robot pose but different views for larger changes in pose. The method has been developed to interface with the RatSLAM system, a biologically inspired method of SLAM. The paper describes the method of learning and recalling visual landmarks in detail and shows the performance of the visual system in real robot tests.
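To make the filter/pool/template idea concrete, the toy sketch below computes a pooled gradient signature for an image and either recalls the closest stored template or learns a new one. The signature, similarity threshold, and template bank are invented for illustration and are not the implementation that interfaces with RatSLAM.

```python
import numpy as np

# Toy view-template sketch: filter (local gradients), pool into a coarse grid,
# then recall the closest template or learn a new one.

def view_signature(img, grid=(4, 4)):
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    h, w = mag.shape
    ph, pw = h // grid[0], w // grid[1]
    pooled = mag[:ph * grid[0], :pw * grid[1]] \
        .reshape(grid[0], ph, grid[1], pw).mean(axis=(1, 3))
    v = pooled.ravel()
    return v / (np.linalg.norm(v) + 1e-9)

class TemplateBank:
    def __init__(self, threshold=0.9):
        self.templates, self.threshold = [], threshold

    def recall_or_learn(self, sig):
        if self.templates:
            sims = np.array([sig @ t for t in self.templates])
            best = int(sims.argmax())
            if sims[best] >= self.threshold:
                return best, False            # recalled an existing view
        self.templates.append(sig)
        return len(self.templates) - 1, True  # learned a new view

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    img = rng.random((64, 64))
    bank = TemplateBank()
    print(bank.recall_or_learn(view_signature(img)))
    print(bank.recall_or_learn(view_signature(img + 0.01 * rng.random((64, 64)))))
```

The pooling grid controls the trade-off described in the abstract: coarser pooling makes the signature stable under small pose changes, while finer pooling separates views taken from more distant poses.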
Abstract:
Probabilistic robotics, most often applied to the problem of simultaneous localisation and mapping (SLAM), requires measures of uncertainty to accompany observations of the environment. This paper describes how uncertainty can be characterised for a vision system that locates coloured landmarks in a typical laboratory environment. The paper describes a model of the uncertainty in segmentation, the internal camera model, and the mounting of the camera on the robot. It explains the implementation of the system on a laboratory robot and provides experimental results that show the coherence of the uncertainty model.
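One common way to characterise such uncertainty is first-order propagation through the camera model; the sketch below propagates the pixel variance of a segmented landmark centroid through a pinhole bearing model. The focal length, principal point, and pixel variance are illustrative values, and this generic propagation is not the specific model developed in the paper.

```python
import numpy as np

# First-order uncertainty propagation: the pixel column of a segmented
# landmark centroid has variance sigma_u^2; a pinhole model maps it to a
# bearing, and the Jacobian of that mapping scales the variance.

def bearing_and_variance(u, sigma_u, fx=500.0, cx=320.0):
    bearing = np.arctan((u - cx) / fx)        # pinhole bearing (radians)
    jac = fx / (fx**2 + (u - cx)**2)          # d(bearing)/du
    return bearing, (jac * sigma_u)**2

if __name__ == "__main__":
    for u in (320.0, 500.0, 630.0):
        b, var = bearing_and_variance(u, sigma_u=2.0)
        print(f"u={u:6.1f}  bearing={np.degrees(b):6.2f} deg  "
              f"std={np.degrees(np.sqrt(var)):.3f} deg")
```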
Abstract:
The topography of the visual evoked magnetic response (VEMR) to a pattern onset stimulus was investigated using 4 check sizes and 3 contrast levels. The pattern onset response consists of three early components within the first 200 ms: CIm, CIIm and CIIIm. The CIIm is usually of high amplitude and is very consistent in latency within a subject. Half field (HF) stimuli produce their strongest response over the contralateral hemisphere; the RHF stimulus exhibits a lower positivity (outgoing field) and an upper negativity (ingoing field), rotated towards the midline. LHF stimulation produced the opposite response, a lower negativity and an upper positivity. Larger check sizes produce a single area of ingoing and outgoing field, while smaller checks produce one area of ingoing and outgoing field over each hemisphere. Latency did not appear to vary with contrast, but amplitudes increased with increasing contrast. A more detailed topographic study incorporating source localisation procedures suggested a source for CIIm 4 cm below the scalp, close to the midline, with current flowing towards the lateral surface. Similar depth and position estimates, but with opposite polarity, were previously obtained for the pattern shift P100m. Hence, the P100m and the CIIm may originate in similar areas of visual cortex but reveal different aspects of visual processing. © 1992 Human Sciences Press, Inc.
Abstract:
The topography of the visual evoked magnetic response (VEMR) to pattern reversal stimulation was studied in four normal subjects using a single-channel BTI magnetometer. VEMRs were recorded from 20 locations over the occipital scalp and the topographic distribution of the most consistent component (P100M) was studied. A single dipole in a sphere model was fitted to the data. Topographic maps recorded two months apart from the same subject with the same stimulus were similar. Half field (HF) stimulation elicited responses from sources on the medial surface of the calcarine fissure, mainly in the contralateral hemisphere, as predicted by the cruciform model. The full field (FF) responses to large checks were approximately the sum of the HF responses. However, with small checks, FF stimulation appeared to activate a different combination of sources than the two HFs. In addition, HF topography was more consistent between subjects than FF topography for small check sizes. Topographic studies of the VEMR may help to explain the analogous visual evoked electrical response and will be essential for defining optimal recording positions for clinical applications.
Abstract:
From the beginning of the twentieth century, "Modernism" impacted and transformed art and clothing. Pablo Picasso and Gabrielle "Coco" Chanel were two of the most central figures in Modernism, working simultaneously in their disciplines. Picasso's innovations, particularly in abstract art, and Chanel's fashion designs, which dramatically departed from the previous corseted and highly decorative styles, were so significant that they have left an influence on contemporary art and fashion. This study compares their visual works and documented evidence of their motivations, within the context of their cultural backgrounds, to reveal meaning in the occurrences of overlaps. This approach examines the historical and cultural background of the artist's and the designer's environment from different perspectives, adding to previous research in this area. The outcomes of the analysis show similarities and divergences in the wider genres of art and fashion and in the practice of the artist and the fashion designer. The reference list to this text, used in the survey, gives a comprehensive overview of pertinent publications disseminating Picasso's and Chanel's visual works, oral perspectives and cultural impact.
Abstract:
In this article we describe a semantic localization dataset for indoor environments named ViDRILO. The dataset provides five sequences of frames acquired with a mobile robot in two similar office buildings under different lighting conditions. Each frame consists of a point cloud representation of the scene and a perspective image. The frames in the dataset are annotated not only with the semantic category of the scene but also with the presence or absence of a list of predefined objects appearing in the scene. In addition to the frames and annotations, the dataset is distributed with a set of tools for its use in both place classification and object recognition tasks. The large number of labeled frames, in conjunction with the annotation scheme, makes this dataset different from existing ones. The ViDRILO dataset is released for use as a benchmark for different problems such as multimodal place classification and object recognition, 3D reconstruction, or point cloud data compression.
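A hypothetical in-memory layout for such annotated frames is sketched below, purely to illustrate the kind of record the abstract describes (image, point cloud, scene category, object presence flags). The field names, the loader-free construction, and the toy statistics are assumptions; the actual dataset is distributed with its own formats and tools.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import numpy as np

# Hypothetical record for a ViDRILO-style frame, for illustration only.

@dataclass
class Frame:
    image: np.ndarray                      # H x W x 3 perspective image
    cloud: np.ndarray                      # N x 3 point cloud (metres)
    category: str                          # semantic category of the scene
    objects: Dict[str, bool] = field(default_factory=dict)  # object presence flags

def category_histogram(frames: List[Frame]) -> Dict[str, int]:
    # Simple sanity check of how frames distribute over semantic categories.
    hist: Dict[str, int] = {}
    for f in frames:
        hist[f.category] = hist.get(f.category, 0) + 1
    return hist

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    frames = [Frame(rng.integers(0, 255, (48, 64, 3), dtype=np.uint8),
                    rng.normal(size=(1000, 3)),
                    cat,
                    {"extinguisher": bool(rng.integers(0, 2))})
              for cat in ("corridor", "office", "corridor")]
    print(category_histogram(frames))
```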