987 results for local feature
Abstract:
In the last decade, local image features have been widely used in robot visual localization. To assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image to those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, we compare several candidate combiners with respect to their performance in the visual localization task. A deeper insight into the potential of the sum and product combiners is provided by testing two extensions of these algebraic rules: threshold and weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. All combiners are tested on a visual localization task, carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions globally achieve the best performance. The voting method, whilst competitive with the algebraic rules in their standard form, is shown to be outperformed by both their modified versions.
Abstract:
In the last decade, local image features have been widely used in robot visual localization. In order to assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image with those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, in this paper we compare several candidate combiners with respect to their performance in the visual localization task. For this evaluation, we selected the most popular methods in the class of non-trained combiners, namely the sum rule and product rule. A deeper insight into the potential of these combiners is provided through a discriminativity analysis involving the algebraic rules and two extensions of these methods: the threshold, as well as the weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. Furthermore, we address the process of constructing a model of the environment by describing how the model granularity impacts upon performance. All combiners are tested on a visual localization task, carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions globally achieve the best performance, confirming the general agreement on the robustness of this rule in other classification problems. The voting method, whilst competitive with the product rule in its standard form, is shown to be outperformed by its modified versions.
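As a rough illustration of the combining step described in the abstract above, the sketch below applies the standard sum rule, the product rule and a plurality vote to a small table of per-feature place scores. The score values, the function names and the reading of voting as one vote per feature are assumptions made for illustration. The threshold and weighted extensions are not shown because the abstract does not define them.

```python
from math import prod

def sum_rule(scores):
    """Add up each place's support over all features (rows)."""
    return [sum(column) for column in zip(*scores)]

def product_rule(scores):
    """Multiply each place's support over all features (rows)."""
    return [prod(column) for column in zip(*scores)]

def plurality_vote(scores):
    """Each feature casts one vote for the place it supports most."""
    votes = [0] * len(scores[0])
    for row in scores:
        votes[row.index(max(row))] += 1
    return votes

def best_place(combined):
    """Index of the place with the highest combined score."""
    return max(range(len(combined)), key=combined.__getitem__)

# Hypothetical data: rows are features of the current image,
# columns are candidate places, entries are similarity-based supports.
scores = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.0, 0.9, 0.1],
]

sum_winner = best_place(sum_rule(scores))
product_winner = best_place(product_rule(scores))
vote_winner = best_place(plurality_vote(scores))
```

In this toy table the sum and product rules both pick place 1, while the vote picks place 0: a single zero support vetoes place 0 under the product rule, and the vote ignores how strongly the third feature favours place 1.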
Abstract:
Local features are used in many computer vision tasks, including visual object categorization, content-based image retrieval and object recognition, to name a few. Local features are points, blobs or regions in images that are extracted using a local feature detector. To make use of the extracted local features, the localized interest points are described using a local feature descriptor. A descriptor histogram vector is a compact representation of an image and can be used for searching and matching images in databases. In this thesis, the performance of local feature detectors and descriptors is evaluated for the object class detection task. Features are extracted from image samples belonging to several object classes. Matching features are then searched for in random image pairs of the same class. The goal of this thesis is to identify the best detector and descriptor methods for this task in terms of detector repeatability and descriptor matching rate.
Abstract:
The usage of digital content, such as video clips and images, has increased dramatically during the last decade. Local image features have been applied increasingly in various image and video retrieval applications. This thesis evaluates local features and applies them to image and video processing tasks. The results of the study show that 1) the performance of different local feature detector and descriptor methods varies significantly in object class matching, 2) local features can be applied in image alignment with results superior to the state of the art, 3) the local feature based shot boundary detection method produces promising results, and 4) the local feature based hierarchical video summarization method points to a promising new research direction. In conclusion, this thesis presents local features as a powerful tool in many applications, and immediate future work should concentrate on improving the quality of the local features.
Abstract:
When we actively explore the visual environment, our gaze preferentially selects regions characterized by high contrast and high density of edges, suggesting that the guidance of eye movements during visual exploration is driven to a significant degree by perceptual characteristics of a scene. Converging findings suggest that the selection of the visual target for the upcoming saccade critically depends on a covert shift of spatial attention. However, it is unclear whether attention selects the location of the next fixation uniquely on the basis of global scene structure or additionally on local perceptual information. To investigate the role of spatial attention in scene processing, we examined eye fixation patterns of patients with spatial neglect during unconstrained exploration of natural images and compared these to healthy and brain-injured control participants. We computed luminance, colour, contrast, and edge information contained in image patches surrounding each fixation and evaluated whether they differed from randomly selected image patches. At the global level, neglect patients showed the characteristic ipsilesional shift of the distribution of their fixations. At the local level, patients with neglect and control participants fixated image regions in ipsilesional space that were closely similar with respect to their local feature content. In contrast, when directing their gaze to contralesional (impaired) space neglect patients fixated regions of significantly higher local luminance and lower edge content than controls. These results suggest that intact spatial attention is necessary for the active sampling of local feature content during scene perception.
Abstract:
One of the most significant research topics in computer vision is object detection. Most of the reported object detection results localise the detected object within a bounding box, but do not explicitly label the edge contours of the object. Since object contours provide a fundamental diagnostic of object shape, some researchers have initiated work on linear contour feature representations for object detection and localisation. However, linear contour feature-based localisation is highly dependent on the performance of linear contour detection within natural images, and this can be perturbed significantly by a cluttered background. In addition, the conventional approach to achieving rotation-invariant features is to rotate the feature receptive field to align with the local dominant orientation before computing the feature representation. Grid resampling after rotation adds extra computational cost and increases the total time needed to compute the feature descriptor. Although this is not an expensive process on current computers, it is desirable for each step of the implementation to be faster to compute, especially when the number of local features grows and the application runs in real time on resource-limited "smart devices" such as mobile phones. Motivated by the above issues, a 2D object localisation system is proposed in this thesis that matches features of edge contour points, an alternative method that takes advantage of shape information for object localisation. This is inspired by the fact that edge contour points are the basic components of shape contours. In addition, edge point detection is usually simpler to achieve than linear edge contour detection. Therefore, the proposed localisation system can avoid the need for linear contour detection and reduce the pathological disruption from the image background. Moreover, since natural images usually comprise many more edge contour points than interest points (i.e. corner points), we also propose new methods to generate rotation-invariant local feature descriptors without pre-rotating the feature receptive field, in order to improve the computational efficiency of the whole system. In detail, the 2D object localisation system is achieved by matching edge contour point features in a constrained search area based on the initial pose estimate produced by a prior object detection process. The local feature descriptor obtains rotation invariance by making use of the rotational symmetry of the hexagonal structure. Therefore, a set of local feature descriptors is proposed based on the hierarchically hexagonal grouping structure. Ultimately, the 2D object localisation system achieves very promising performance based on matching the proposed features of edge contour points, with a mean correct labelling rate of 0.8654 and a mean false labelling rate of 0.0314 for the edge contour points on data from the Amsterdam Library of Object Images (ALOI). Furthermore, the proposed descriptors are evaluated against state-of-the-art descriptors and achieve competitive performance in terms of pose estimation, with a pose error of around half a pixel.
Abstract:
We consider brightness/contrast-invariant and rotation-discriminating template matching that searches an image to analyze, A, for a query image, Q. We propose to use the complex coefficients of the discrete Fourier transform of the radial projections to compute new rotation-invariant local features. These coefficients can be efficiently obtained via FFT. We classify templates into "stable" and "unstable" ones and argue that any local feature-based template matching may fail to find unstable templates. We extract several stable sub-templates of Q and find them in A by comparing the features. The matchings of the sub-templates are combined using the Hough transform. As the features of A are computed only once, the algorithm can quickly find many different sub-templates in A, and it is suitable for finding many query images in A, multi-scale searching and partial occlusion-robust template matching. (C) 2009 Elsevier Ltd. All rights reserved.
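The property this abstract relies on can be sketched in a few lines. If a template is rotated by a whole number of angular steps, its vector of radial projections is circularly shifted. The DFT magnitudes of that vector then stay the same, while the phase of the first coefficient reveals the shift. The projection values below are made up and the radial projections are assumed to be already computed. This illustrates the general DFT shift property only, not the paper's actual feature definition.

```python
import cmath

def dft(x):
    """Plain O(n^2) discrete Fourier transform (an FFT would be used in practice)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def circular_shift(x, s):
    """Shift x by s positions; models a rotation by s angular steps."""
    return x[-s:] + x[:-s]

# Hypothetical radial projections sampled at 8 equally spaced angles.
projections = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
rotated = circular_shift(projections, 3)

coeffs = dft(projections)
coeffs_rotated = dft(rotated)

# Rotation-invariant part: the magnitudes of the coefficients.
magnitudes = [abs(c) for c in coeffs]
magnitudes_rotated = [abs(c) for c in coeffs_rotated]

# Rotation-discriminating part: the phase change of coefficient 1
# equals -2*pi*s/n, so the shift s can be recovered from it.
n = len(projections)
phase_change = cmath.phase(coeffs_rotated[1] / coeffs[1])
estimated_shift = round(-phase_change * n / (2 * cmath.pi)) % n
```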
Abstract:
In visual sensor networks, local feature descriptors can be computed at the sensing nodes, which work collaboratively on the data obtained to make an efficient visual analysis. In fact, with a minimal amount of computational effort, the detection and extraction of local features, such as binary descriptors, can provide a reliable and compact image representation. In this paper, it is proposed to extract and code binary descriptors to meet the energy and bandwidth constraints at each sensing node. The major contribution is a binary descriptor coding technique that exploits the correlation using two different coding modes: Intra, which exploits the correlation between the elements that compose a descriptor; and Inter, which exploits the correlation between descriptors of the same image. The experimental results show bitrate savings of up to 35% without any impact on the performance of the image retrieval task. © 2014 EURASIP.
Abstract:
The number of digital images has been increasing exponentially in the last few years. People have problems managing their image collections and finding a specific image. An automatic image categorization system could help them to manage images and find specific images. In this thesis, an unsupervised visual object categorization system was implemented to categorize a set of unknown images. The system is unsupervised and hence does not need known training images, which would have to be obtained manually. Therefore, the number of possible categories and images can be huge. The system implemented in the thesis extracts local features from the images. These local features are used to build a codebook. The local features and the codebook are then used to generate a feature vector for an image. Images are categorized based on the feature vectors. The system is able to categorize any given set of images based on the visual appearance of the images. Images that have similar image regions are grouped together in the same category. Thus, for example, images which contain cars are assigned to the same cluster. The unsupervised visual object categorization system can be used in many situations, e.g., in an Internet search engine. The system can categorize images for a user, and the user can then easily find a specific type of image.
Abstract:
The large and growing number of digital images is making manual image search laborious. Only a fraction of the images contain metadata that can be used to search for a particular type of image. Thus, the main research question of this thesis is whether it is possible to learn visual object categories directly from images. Computers process images as long lists of pixels that do not have a clear connection to high-level semantics which could be used in the image search. Various methods have been introduced in the literature to extract low-level image features, as well as approaches to connect these low-level features with high-level semantics. One of these approaches, called Bag-of-Features, is studied in the thesis. In the Bag-of-Features approach, the images are described using a visual codebook. The codebook is built from the descriptions of the image patches using clustering. The images are described by matching descriptions of image patches with the visual codebook and computing the number of matches for each code. In this thesis, unsupervised visual object categorisation using the Bag-of-Features approach is studied. The goal is to find groups of similar images, e.g., images that contain an object from the same category. The standard Bag-of-Features approach is improved by using spatial information and visual saliency. It was found that the performance of the visual object categorisation can be improved by using spatial information of local features to verify the matches. However, this process is computationally heavy, and thus, the number of images must be limited in the spatial matching, for example, by using the Bag-of-Features method as in this study. Different approaches for saliency detection are studied and a new method based on the Hessian-Affine local feature detector is proposed. The new method achieves results comparable with the current state of the art. The visual object categorisation performance was improved by using foreground segmentation based on saliency information, especially when the background could be considered as clutter.
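The Bag-of-Features description step outlined above (match each patch description to the codebook, then count the matches per code) can be sketched as follows. This is a minimal illustration that assumes the codebook has already been built by clustering. The two-dimensional descriptors, the squared Euclidean matching and the function names are hypothetical choices, not details taken from the thesis.

```python
def nearest_code(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    def squared_distance(code):
        return sum((d - c) ** 2 for d, c in zip(descriptor, code))
    return min(range(len(codebook)), key=lambda i: squared_distance(codebook[i]))

def bag_of_features(descriptors, codebook):
    """Count how many descriptors of one image match each code."""
    histogram = [0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_code(descriptor, codebook)] += 1
    return histogram

# Hypothetical codebook with three codes (e.g. cluster centres).
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

# Hypothetical patch descriptors extracted from a single image.
descriptors = [[0.1, 0.0], [0.9, 1.1], [0.2, 0.1], [0.1, 0.8]]

image_vector = bag_of_features(descriptors, codebook)
```

The resulting histogram is the image's feature vector. Images can then be grouped by comparing these vectors.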
Abstract:
This paper presents a robust Content Based Video Retrieval (CBVR) system. This system retrieves similar videos based on a local feature descriptor called SURF (Speeded Up Robust Features). The high dimensionality of SURF-like feature descriptors causes huge storage consumption during indexing of video information. To reduce the dimensionality of the SURF feature descriptor, this system employs a stochastic dimensionality reduction method and thus provides model data for the videos. On retrieval, the model data of the test clip is assigned to its similar videos using a minimum distance classifier. The performance of this system is evaluated using two different minimum distance classifiers during the retrieval stage. The experimental analyses performed on the system show that it has a retrieval performance of 78%. This system also analyses the performance efficiency of the low-dimensional SURF descriptor.
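A minimum distance classifier of the kind mentioned above can be sketched as a nearest-model lookup. The abstract does not say which two distance measures were compared, so the Euclidean and Manhattan distances below are assumptions. The clip names and low-dimensional model vectors are also made up.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two model vectors."""
    return math.dist(a, b)

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_video(query, models, distance):
    """Name of the stored video whose model data is closest to the query."""
    return min(models, key=lambda name: distance(query, models[name]))

# Hypothetical reduced model data for the indexed videos.
models = {
    "clip_a": [0.1, 0.9, 0.3],
    "clip_b": [0.8, 0.2, 0.5],
}

# Hypothetical model data of the test clip.
query = [0.7, 0.3, 0.4]

match_euclidean = nearest_video(query, models, euclidean)
match_manhattan = nearest_video(query, models, manhattan)
```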
Abstract:
As the popularity of digital videos increases, a large number of illegal videos are being generated and published. Video copies are generated by performing various sorts of transformations on the original video data. To identify such illegal videos effectively, image features that are invariant to various transformations must be extracted for similarity matching. An image feature can be either a local feature or a global feature. Of the two, local features are powerful and have been applied in a wide variety of computer vision applications. This paper focuses on various recently proposed local detectors and descriptors that are invariant to a number of image transformations.
Abstract:
Graduate Program in History - FCHS
Abstract:
Background: Recently, it was realized that the functional connectivity networks estimated from current brain-imaging technologies (MEG, fMRI and EEG) can be analyzed by means of graph theory, which provides a mathematical representation of a network essentially reduced to nodes and the connections between them. Methods: We used high-resolution EEG technology to enhance the poor spatial information of the EEG activity on the scalp and to obtain a measure of the electrical activity on the cortical surface. Afterwards, we used the Directed Transfer Function (DTF), a multivariate spectral measure for estimating the directional influences between any given pair of channels in a multivariate dataset. Finally, a graph theoretical approach was used to model the brain networks as graphs. These methods were used to analyze the structure of cortical connectivity during the attempt to move a paralyzed limb in a group (N=5) of spinal cord injured (SCI) patients and during movement execution in a group (N=5) of healthy subjects. Results: Analysis performed on the cortical networks estimated from the healthy and SCI groups revealed that both groups present few nodes with a high out-degree value (i.e. number of outgoing links). This property holds in the networks estimated for all the frequency bands investigated. In particular, the cingulate motor area (CMA) ROIs act as "hubs" for the outflow of information in both groups, SCI and healthy. Results also suggest that spinal cord injuries affect the functional architecture of the cortical network sub-serving the volition of motor acts mainly in its local feature property. In particular, a higher local efficiency El can be observed in the SCI patients for three frequency bands, theta (3-6 Hz), alpha (7-12 Hz) and beta (13-29 Hz). By taking into account all the possible pathways between different ROI pairs, we were able to separate clearly the network properties of the SCI group from those of the control (CTRL) group. In particular, we report a sort of compensatory mechanism in the SCI patients for the theta (3-6 Hz) frequency band, indicating a higher level of "activation" Ω within the cortical network during the motor task. The activation index is directly related to diffusion, a type of dynamics that underlies several biological systems, including the possible spreading of neuronal activation across several cortical regions. Conclusions: The present study aims at demonstrating the possible applications of graph theoretical approaches in the analysis of brain functional connectivity from EEG signals. In particular, the methodological aspects of i) estimating cortical activity from scalp EEG signals, ii) functional connectivity estimation and iii) graph theoretical indexes are emphasized in the present paper to show their impact in a real application.
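The out-degree measure used in the Results above can be illustrated with a toy directed graph. The adjacency matrix, the node count and the hub threshold below are hypothetical. They are not derived from the study's EEG data, and the abstract does not state how hubs were thresholded.

```python
def out_degrees(adjacency):
    """Number of outgoing links per node; adjacency[i][j] == 1 means i -> j."""
    return [sum(row) for row in adjacency]

def find_hubs(adjacency, min_out_degree):
    """Nodes whose out-degree reaches the given (assumed) threshold."""
    degrees = out_degrees(adjacency)
    return [node for node, degree in enumerate(degrees) if degree >= min_out_degree]

# Hypothetical directed network over four regions of interest.
adjacency = [
    [0, 1, 1, 1],  # node 0 sends information to every other node
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

degrees = out_degrees(adjacency)
hubs = find_hubs(adjacency, min_out_degree=3)
```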
Abstract:
Theories of image segmentation suggest that the human visual system may use two distinct processes to segregate figure from background: a local process that uses local feature contrasts to mark borders of coherent regions and a global process that groups similar features over a larger spatial scale. We performed psychophysical experiments to determine whether and to what extent the global similarity process contributes to image segmentation by motion and color. Our results show that for color, as well as for motion, segmentation occurs first by an integrative process on a coarse spatial scale, demonstrating that for both modalities the global process is faster than one based on local feature contrasts. Segmentation by motion builds up over time, whereas segmentation by color does not, indicating a fundamental difference between the modalities. Our data suggest that segmentation by motion proceeds first via a cooperative linking over space of local motion signals, generating almost immediate perceptual coherence even of physically incoherent signals. This global segmentation process occurs faster than the detection of absolute motion, providing further evidence for the existence of two motion processes with distinct dynamic properties.