962 results for image matching
Abstract:
This thesis addresses the problem of detecting and describing the same scene points in different wide-angle images taken by the same camera at different viewpoints. This is a core competency of many vision-based localisation tasks including visual odometry and visual place recognition. Wide-angle cameras have a large field of view that can exceed a full hemisphere, and the images they produce contain severe radial distortion. When compared to traditional narrow field of view perspective cameras, more accurate estimates of camera egomotion can be found using the images obtained with wide-angle cameras. The ability to accurately estimate camera egomotion is a fundamental primitive of visual odometry, and this is one of the reasons for the increased popularity in the use of wide-angle cameras for this task. Their large field of view also enables them to capture images of the same regions in a scene taken at very different viewpoints, and this makes them suited for visual place recognition. However, the ability to estimate the camera egomotion and recognise the same scene in two different images is dependent on the ability to reliably detect and describe the same scene points, or ‘keypoints’, in the images. Most algorithms used for this purpose are designed almost exclusively for perspective images. Applying algorithms designed for perspective images directly to wide-angle images is problematic as no account is made for the image distortion. The primary contribution of this thesis is the development of two novel keypoint detectors, and a method of keypoint description, designed for wide-angle images. Both detectors reformulate the Scale-Invariant Feature Transform (SIFT) as an image processing operation on the sphere. As the image captured by any central projection wide-angle camera can be mapped to the sphere, applying these variants to an image on the sphere enables keypoints to be detected in a manner that is invariant to image distortion.
Each of the variants is required to find the scale-space representation of an image on the sphere, and they differ in the approaches they use to do this. Extensive experiments using real and synthetically generated wide-angle images are used to validate the two new keypoint detectors and the method of keypoint description. The better of these two new keypoint detectors is applied to vision-based localisation tasks including visual odometry and visual place recognition using outdoor wide-angle image sequences. As part of this work, the effect of keypoint coordinate selection on the accuracy of egomotion estimates using the Direct Linear Transform (DLT) is investigated, and a simple weighting scheme is proposed which attempts to account for the uncertainty of keypoint positions during detection. A word reliability metric is also developed for use within a visual ‘bag of words’ approach to place recognition.
Abstract:
Camera calibration information is required in order for multiple camera networks to deliver more than the sum of many single camera systems. Methods exist for manually calibrating cameras with high accuracy. Manually calibrating networks with many cameras is, however, time consuming, expensive and impractical for networks that undergo frequent change. For this reason, automatic calibration techniques have been vigorously researched in recent years. Fully automatic calibration methods depend on the ability to automatically find point correspondences between overlapping views. In typical camera networks, cameras are placed far apart to maximise coverage. This is referred to as a wide base-line scenario. Finding sufficient correspondences for camera calibration in wide base-line scenarios presents a significant challenge. This thesis focuses on developing more effective and efficient techniques for finding correspondences in uncalibrated, wide baseline, multiple-camera scenarios. The project consists of two major areas of work. The first is the development of more effective and efficient view covariant local feature extractors. The second area involves finding methods to extract scene information using the information contained in a limited set of matched affine features. Several novel affine adaptation techniques for salient features have been developed. A method is presented for efficiently computing the discrete scale space primal sketch of local image features. A scale selection method was implemented that makes use of the primal sketch. The primal sketch-based scale selection method has several advantages over the existing methods. It allows greater freedom in how the scale space is sampled, enables more accurate scale selection, is more effective at combining different functions for spatial position and scale selection, and leads to greater computational efficiency. 
Existing affine adaptation methods make use of the second moment matrix to estimate the local affine shape of local image features. In this thesis, it is shown that the Hessian matrix can be used in a similar way to estimate local feature shape. The Hessian matrix is effective for estimating the shape of blob-like structures, but is less effective for corner structures. It is simpler to compute than the second moment matrix, leading to a significant reduction in computational cost. A wide baseline dense correspondence extraction system, called WiDense, is presented in this thesis. It allows the extraction of large numbers of additional accurate correspondences, given only a few initial putative correspondences. It consists of the following algorithms: an affine region alignment algorithm that ensures accurate alignment between matched features; a method for extracting more matches in the vicinity of a matched pair of affine features, using the alignment information contained in the match; and an algorithm for extracting large numbers of highly accurate point correspondences from an aligned pair of feature regions. Experiments show that the correspondences generated by the WiDense system improve the success rate of computing the epipolar geometry of very widely separated views. This new method is successful in many cases where the features produced by the best wide baseline matching algorithms are insufficient for computing the scene geometry.
Abstract:
While researchers strive to improve automatic face recognition performance, the relationship between image resolution and face recognition performance has not received much attention. This relationship is examined systematically and a framework is developed such that results from super-resolution techniques can be compared. Three super-resolution techniques are compared with the Eigenface and Elastic Bunch Graph Matching face recognition engines. The parameter ranges over which these techniques provide better recognition performance than interpolated images are determined.
Abstract:
Affine covariant local image features are a powerful tool for many applications, including matching and calibrating wide baseline images. Local feature extractors that use a saliency map to locate features require adaptation processes in order to extract affine covariant features. The most effective extractors make use of the second moment matrix (SMM) to iteratively estimate the affine shape of local image regions. This paper shows that the Hessian matrix can be used to estimate local affine shape in a similar fashion to the SMM. The Hessian matrix requires significantly less computational effort than the SMM, allowing more efficient affine adaptation. Experimental results indicate that using the Hessian matrix in conjunction with a feature extractor that selects features in regions with high second order gradients delivers equivalent quality correspondences in less than 17% of the processing time, compared to the same extractor using the SMM.
Abstract:
The rank transform is a non-parametric technique which has been recently proposed for the stereo matching problem. The motivation behind its application to the matching problem is its invariance to certain types of image distortion and noise, as well as its amenability to real-time implementation. This paper derives an analytic expression for the process of matching using the rank transform, and then goes on to derive one constraint which must be satisfied for a correct match. This has been dubbed the rank order constraint or simply the rank constraint. Experimental work has shown that this constraint is capable of resolving ambiguous matches, thereby improving matching reliability. This constraint was incorporated into a new algorithm for matching using the rank transform. This modified algorithm resulted in an increased proportion of correct matches, for all test imagery used.
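As a rough illustration of the rank transform mentioned in this abstract, the sketch below replaces each pixel with the number of window neighbours that are darker than it. It is a minimal pure-Python version, not the paper's implementation; the function name `rank_transform`, the `radius` parameter, and the choice to skip neighbours outside the image are assumptions made for the example. It does not implement the rank constraint itself.

```python
def rank_transform(image, radius=1):
    """Return the rank transform of a 2-D list of intensities.

    Each output value counts the pixels in the (2*radius+1) x (2*radius+1)
    window whose intensity is strictly less than the centre pixel.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            centre = image[y][x]
            count = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    # Assumption: neighbours outside the image are skipped.
                    if 0 <= ny < height and 0 <= nx < width:
                        if image[ny][nx] < centre:
                            count += 1
            out[y][x] = count
    return out


image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
# A strictly increasing intensity change (gain and offset) keeps the ordering
# of pixel values, so the rank transform is unchanged.
brighter = [[2 * value + 10 for value in row] for row in image]
same = rank_transform(image) == rank_transform(brighter)
```

Because only the ordering of intensities inside the window matters, the result is unaffected by gain and offset changes between the two images, which is one of the invariances the abstract refers to.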
Abstract:
Data structures such as k-D trees and hierarchical k-means trees perform very well in approximate k nearest neighbour matching, but are only marginally more effective than linear search when performing exact matching in high-dimensional image descriptor data. This paper presents several improvements to linear search that allow it to outperform existing methods and recommends two approaches to exact matching. The first method reduces the number of operations by evaluating the distance measure in order of significance of the query dimensions and terminating when the partial distance exceeds the search threshold. This method does not require preprocessing and significantly outperforms existing methods. The second method improves query speed further by presorting the data using a data structure called d-D sort. The order information is used as a priority queue to reduce the time taken to find the exact match and to restrict the range of data searched. Construction of the d-D sort structure is very simple to implement, does not require any parameter tuning, and requires significantly less time than the best-performing tree structure. Data can also be added to the structure relatively efficiently.
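The first method described in this abstract, partial-distance linear search, can be sketched as follows. This is a minimal example under stated assumptions rather than the paper's algorithm: "significance" is approximated here by the magnitude of each query component, the distance is squared Euclidean, and the search threshold is taken to be the best distance found so far. The function name `nearest_neighbour` is hypothetical, and the d-D sort structure is not shown.

```python
def nearest_neighbour(query, data):
    """Exact nearest neighbour by linear search with early termination.

    Returns (index, squared_distance) of the closest vector in data.
    """
    # Assumption: dimensions with large query values are visited first.
    order = sorted(range(len(query)), key=lambda i: -abs(query[i]))
    best_index, best_dist = -1, float("inf")
    for index, candidate in enumerate(data):
        partial = 0.0
        for i in order:
            diff = query[i] - candidate[i]
            partial += diff * diff
            if partial >= best_dist:
                # The partial distance already rules this candidate out.
                break
        else:
            # Loop finished without a break: this candidate is the new best.
            best_index, best_dist = index, partial
    return best_index, best_dist


data = [[0, 0, 0], [1, 2, 3], [5, 5, 5]]
result = nearest_neighbour([1, 2, 2], data)
```

Since squared differences are non-negative, the partial sum can only grow, so abandoning a candidate once it reaches the current best distance never discards the true nearest neighbour.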
Abstract:
Whole image descriptors have recently been shown to be remarkably robust to perceptual change, especially compared to local features. However, whole-image-based localization systems typically rely on heuristic methods for determining appropriate matching thresholds in a particular environment. These environment-specific tuning requirements and the lack of a meaningful interpretation of these arbitrary thresholds limit the general applicability of these systems. In this paper we present a Bayesian model of probability for whole-image descriptors that can be seamlessly integrated into localization systems designed for probabilistic visual input. We demonstrate this method using CAT-Graph, an appearance-based visual localization system originally designed for a FAB-MAP-style probabilistic input. We show that using whole-image descriptors as visual input extends CAT-Graph’s functionality to environments that experience a greater amount of perceptual change. We also present a method of estimating whole-image probability models in an online manner, removing the need for a prior training phase. We show that this online, automated training method can perform comparably to pre-trained, manually tuned local descriptor methods.
Abstract:
This paper describes a novel system for automatic classification of images obtained from Anti-Nuclear Antibody (ANA) pathology tests on Human Epithelial type 2 (HEp-2) cells using the Indirect Immunofluorescence (IIF) protocol. The IIF protocol on HEp-2 cells has been the hallmark method to identify the presence of ANAs, due to its high sensitivity and the large range of antigens that can be detected. However, it suffers from numerous shortcomings, such as being subjective as well as time and labour intensive. Computer Aided Diagnostic (CAD) systems, which automatically classify a HEp-2 cell image into one of its known patterns (e.g. speckled, homogeneous), have been developed to address these problems. Most of the existing CAD systems use handpicked features to represent a HEp-2 cell image, which may only work in limited scenarios. We propose a novel automatic cell image classification method termed Cell Pyramid Matching (CPM), which comprises regional histograms of visual words coupled with the Multiple Kernel Learning framework. We present a study of several variations of generating histograms and show the efficacy of the system on two publicly available datasets: the ICPR HEp-2 cell classification contest dataset and the SNPHEp-2 dataset.
Abstract:
Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters representing different characteristics of an object, due to different undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to resemble the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).
Abstract:
Non-rigid image registration is an essential tool required for overcoming the inherent local anatomical variations that exist between images acquired from different individuals or atlases. Furthermore, certain applications require this type of registration to operate across images acquired from different imaging modalities. One popular local approach for estimating this registration is a block matching procedure utilising the mutual information criterion. However, previous block matching procedures generate a sparse deformation field containing displacement estimates at uniformly spaced locations. This neglects to make use of the evidence that block matching results are dependent on the amount of local information content. This paper presents a solution to this drawback by proposing the use of a Reversible Jump Markov Chain Monte Carlo statistical procedure to optimally select grid points of interest. Three different methods for propagating the estimated sparse deformation field to the entire image are then compared: a thin-plate spline warp, Gaussian convolution, and a hybrid fluid technique. Results show that non-rigid registration can be improved by using the proposed algorithm to optimally select grid points of interest.
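The mutual information criterion used for block matching in this abstract can be illustrated with a small sketch. It estimates mutual information from the joint histogram of two equally sized blocks, here passed as flat lists of discrete intensities. The function name `mutual_information` and the use of raw intensity values as histogram bins are assumptions for the example; the grid point selection procedure from the paper is not shown.

```python
import math
from collections import Counter


def mutual_information(block_a, block_b):
    """Mutual information (in bits) between two equally sized blocks.

    Both blocks are flat lists of discrete intensity values; each distinct
    value is treated as its own histogram bin (an assumption of this sketch).
    """
    if len(block_a) != len(block_b) or not block_a:
        raise ValueError("blocks must be non-empty and of equal size")
    n = len(block_a)
    joint = Counter(zip(block_a, block_b))
    marginal_a = Counter(block_a)
    marginal_b = Counter(block_b)
    total = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) written with counts to avoid extra divisions.
        ratio = count * n / (marginal_a[a] * marginal_b[b])
        total += p_ab * math.log2(ratio)
    return total


# Different intensity values, but one block fully predicts the other.
aligned = mutual_information([0, 0, 1, 1], [5, 5, 9, 9])
# Intensities that are statistically independent of each other.
unrelated = mutual_information([0, 0, 1, 1], [5, 9, 5, 9])
```

The measure depends on how well one block predicts the other rather than on the intensities being equal, which is why it suits registration across imaging modalities.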
Abstract:
Due to its three-dimensional folding pattern, the human neocortex poses a challenge for accurate co-registration of grouped functional brain imaging data. The present study addressed this problem by employing three-dimensional continuum-mechanical image-warping techniques to derive average anatomical representations for co-registration of functional magnetic resonance brain imaging data obtained from 10 male first-episode schizophrenia patients and 10 age-matched male healthy volunteers while they performed a version of the Tower of London task. This novel technique produced an equivalent representation of blood oxygenation level dependent (BOLD) response across hemispheres, cortical regions, and groups, respectively, when compared to intensity average co-registration, using a deformable Brodmann area atlas as anatomical reference. Somewhat closer association of Brodmann area boundaries with primary visual and auditory areas was evident using the gyral pattern average model. Statistically-thresholded BOLD cluster data confirmed predominantly bilateral prefrontal and parietal, right frontal and dorsolateral prefrontal, and left occipital activation in healthy subjects, while patients’ hemispheric dominance pattern was diminished or reversed, particularly decreasing cortical BOLD response with increasing task difficulty in the right superior temporal gyrus. Reduced regional gray matter thickness correlated with reduced left-hemispheric prefrontal/frontal and bilateral parietal BOLD activation in patients. This is the first study demonstrating that reduction of regional gray matter in first-episode schizophrenia patients is associated with impaired brain function when performing the Tower of London task, and supports previous findings of impaired executive attention and working memory in schizophrenia.
Abstract:
In this paper, we present a new feature-based approach for mosaicing of camera-captured document images. A novel block-based scheme is employed to ensure that corners can be reliably detected over a wide range of images. A 2-D discrete cosine transform is computed for image blocks defined around each of the detected corners, and a small subset of the coefficients is used as a feature vector. A 2-pass feature matching is performed to establish point correspondences from which the homography relating the input images can be computed. The algorithm is tested on a number of complex document images casually taken with a hand-held camera, yielding convincing results.
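The DCT-based descriptor described in this abstract can be sketched as follows: compute an orthonormal 2-D DCT-II of a square block and keep a few coefficients as the feature vector. The abstract does not say which coefficients are retained, so keeping the top-left `keep` x `keep` low-frequency coefficients is an assumption, as are the function names `dct_2d` and `block_feature`. Corner detection and the 2-pass matching are not shown.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        # Scaling that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def block_feature(block, keep=3):
    """Feature vector from the keep x keep lowest-frequency coefficients.

    Assumption: the retained subset is the top-left corner of the DCT.
    """
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


flat_block = [[2.0] * 4 for _ in range(4)]
ramp_block = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
feature = block_feature(ramp_block)
```

This direct summation costs O(n^4) per block and is written for clarity only; a practical system would use a fast DCT routine.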
Abstract:
Fusion of multi-sensor imaging data enables a synergetic interpretation of complementary information obtained by sensors of different spectral ranges. Multi-sensor data of diverse spectral, spatial and temporal resolutions require advanced numerical techniques for analysis and interpretation. This paper reviews ten advanced pixel based image fusion techniques: Component substitution (COS), Local mean and variance matching, Modified IHS (Intensity Hue Saturation), Fast Fourier Transformed-enhanced IHS, Laplacian Pyramid, Local regression, Smoothing filter (SF), Sparkle, SVHC and Synthetic Variable Ratio. The above techniques were tested on IKONOS data (Panchromatic band at 1 m spatial resolution and Multispectral 4 bands at 4 m spatial resolution). Evaluation of the fused results through various accuracy measures revealed that the SF and COS methods produce images closest to what the corresponding multi-sensor would observe at the highest resolution level (1 m).
Abstract:
A new technique is proposed for multisensor image registration by matching features using discrete particle swarm optimization (DPSO). The feature points are first extracted from the reference and sensed images using an improved Harris corner detector available in the literature. From the extracted corner points, DPSO finds three corresponding points in the sensed and reference images using multiobjective optimization of distance and angle conditions through an objective switching technique. In this way, the globally best matched points are obtained, and these are used to evaluate the affine transformation for the sensed image. The performance of the image registration is evaluated, and it is concluded that the proposed approach is efficient.
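Once three non-collinear point pairs have been matched, as in the abstract above, an affine transformation is fully determined. The sketch below solves for its six parameters with Cramer's rule. The function names, and the convention that points are mapped from the sensed image to the reference image, are assumptions for illustration; the DPSO matching step is not shown.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("the three points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def affine_from_three_points(src, dst):
    """Affine parameters ((a, b, c), (d, e, f)) mapping src points to dst.

    The model is x' = a*x + b*y + c and y' = d*x + e*y + f.
    """
    matrix = [[float(x), float(y), 1.0] for x, y in src]
    row_x = solve3(matrix, [float(p[0]) for p in dst])
    row_y = solve3(matrix, [float(p[1]) for p in dst])
    return tuple(row_x), tuple(row_y)


def apply_affine(params, point):
    """Apply the affine parameters to a single (x, y) point."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical matched points: sensed image (src) and reference image (dst).
sensed = [(0, 0), (1, 0), (0, 1)]
reference = [(3, -1), (5, -1), (3, 1)]
params = affine_from_three_points(sensed, reference)
mapped = apply_affine(params, (2, 2))
```

With more than three matches, a least-squares fit would normally be preferred so that localisation errors in individual corners average out.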
Abstract:
This paper investigates a novel approach for point matching of multi-sensor satellite imagery. The feature (corner) points extracted using an improved version of the Harris Corner Detector (HCD) are matched using multi-objective optimization based on a Genetic Algorithm (GA). An objective switching approach to optimization that incorporates an angle criterion, distance condition and point matching condition in the multi-objective fitness function is applied to match corresponding corner points between the reference image and the sensed image. The matched points obtained in this way are used to align the sensed image with a reference image by applying an affine transformation. From the results obtained, the performance of the image registration is evaluated and compared with existing methods, namely Nearest Neighbor-Random SAmple Consensus (NN-Ran-SAC) and multi-objective Discrete Particle Swarm Optimization (DPSO). From the experiments performed, it can be concluded that the proposed approach is an accurate method for registration of multi-sensor satellite imagery. (C) 2014 Elsevier Inc. All rights reserved.