934 resultados para Place recognition algorithm
Resumo:
An algorithm of PCA face recognition based on Multi-degree of Freedom Neurons theory is proposed, which based on the sample sets' topological character in the feature space which is different from "classification". Compare with the traditional PCA+NN algorithm, experiments prove its efficiency.
Resumo:
This paper discusses the algorithm on the distance from a point and an infinite sub-space in high dimensional space With the development of Information Geometry([1]), the analysis tools of points distribution in high dimension space, as a measure of calculability, draw more attention of experts of pattern recognition. By the assistance of these tools, Geometrical properties of sets of samples in high-dimensional structures are studied, under guidance of the established properties and theorems in high-dimensional geometry.
Resumo:
In recognition-based user interface, users’ satisfaction is determined not only by recognition accuracy but also by effort to correct recognition errors. In this paper, we introduce a crossmodal error correction technique, which allows users to correct errors of Chinese handwriting recognition by speech. The focus of the paper is a multimodal fusion algorithm supporting the crossmodal error correction. By fusing handwriting and speech recognition, the algorithm can correct errors in both character extraction and recognition of handwriting. The experimental result indicates that the algorithm is effective and efficient. Moreover, the evaluation also shows the correction technique can help users to correct errors in handwriting recognition more efficiently than the other two error correction techniques.
Resumo:
Subspace learning is the process of finding a proper feature subspace and then projecting high-dimensional data onto the learned low-dimensional subspace. The projection operation requires many floating-point multiplications and additions, which makes the projection process computationally expensive. To tackle this problem, this paper proposes two simple-but-effective fast subspace learning and image projection methods, fast Haar transform (FHT) based principal component analysis and FHT based spectral regression discriminant analysis. The advantages of these two methods result from employing both the FHT for subspace learning and the integral vector for feature extraction. Experimental results on three face databases demonstrated their effectiveness and efficiency.
Resumo:
In this paper we present some extensions to the k-means algorithm for vector quantization that permit its efficient use in image segmentation and pattern classification tasks. It is shown that by introducing state variables that correspond to certain statistics of the dynamic behavior of the algorithm, it is possible to find the representative centers fo the lower dimensional maniforlds that define the boundaries between classes, for clouds of multi-dimensional, mult-class data; this permits one, for example, to find class boundaries directly from sparse data (e.g., in image segmentation tasks) or to efficiently place centers for pattern classification (e.g., with local Gaussian classifiers). The same state variables can be used to define algorithms for determining adaptively the optimal number of centers for clouds of data with space-varying density. Some examples of the applicatin of these extensions are also given.
Resumo:
A scheme for recognizing 3D objects from single 2D images is introduced. The scheme proceeds in two stages. In the first stage, the categorization stage, the image is compared to prototype objects. For each prototype, the view that most resembles the image is recovered, and, if the view is found to be similar to the image, the class identity of the object is determined. In the second stage, the identification stage, the observed object is compared to the individual models of its class, where classes are expected to contain objects with relatively similar shapes. For each model, a view that matches the image is sought. If such a view is found, the object's specific identity is determined. The advantage of categorizing the object before it is identified is twofold. First, the image is compared to a smaller number of models, since only models that belong to the object's class need to be considered. Second, the cost of comparing the image to each model in a classis very low, because correspondence is computed once for the whoel class. More specifically, the correspondence and object pose computed in the categorization stage to align the prototype with the image are reused in the identification stage to align the individual models with the image. As a result, identification is reduced to a series fo simple template comparisons. The paper concludes with an algorithm for constructing optimal prototypes for classes of objects.
Resumo:
This paper describes a machine vision system that classifies reflectance properties of surfaces such as metal, plastic, or paper, under unknown real-world illumination. We demonstrate performance of our algorithm for surfaces of arbitrary geometry. Reflectance estimation under arbitrary omnidirectional illumination proves highly underconstrained. Our reflectance estimation algorithm succeeds by learning relationships between surface reflectance and certain statistics computed from an observed image, which depend on statistical regularities in the spatial structure of real-world illumination. Although the algorithm assumes known geometry, its statistical nature makes it robust to inaccurate geometry estimates.
Resumo:
The key to understanding a program is recognizing familiar algorithmic fragments and data structures in it. Automating this recognition process will make it easier to perform many tasks which require program understanding, e.g., maintenance, modification, and debugging. This report describes a recognition system, called the Recognizer, which automatically identifies occurrences of stereotyped computational fragments and data structures in programs. The Recognizer is able to identify these familiar fragments and structures, even though they may be expressed in a wide range of syntactic forms. It does so systematically and efficiently by using a parsing technique. Two important advances have made this possible. The first is a language-independent graphical representation for programs and programming structures which canonicalizes many syntactic features of programs. The second is an efficient graph parsing algorithm.
Resumo:
Liu, Yonghuai. Automatic 3d free form shape matching using the graduated assignment algorithm. Pattern Recognition, vol. 38, no. 10, pp. 1615-1631, 2005.
Resumo:
This paper introduces an algorithm that uses boosting to learn a distance measure for multiclass k-nearest neighbor classification. Given a family of distance measures as input, AdaBoost is used to learn a weighted distance measure, that is a linear combination of the input measures. The proposed method can be seen both as a novel way to learn a distance measure from data, and as a novel way to apply boosting to multiclass recognition problems, that does not require output codes. In our approach, multiclass recognition of objects is reduced into a single binary recognition task, defined on triples of objects. Preliminary experiments with eight UCI datasets yield no clear winner among our method, boosting using output codes, and k-nn classification using an unoptimized distance measure. Our algorithm did achieve lower error rates in some of the datasets, which indicates that, in some domains, it may lead to better results than existing methods.
Resumo:
A framework for the simultaneous localization and recognition of dynamic hand gestures is proposed. At the core of this framework is a dynamic space-time warping (DSTW) algorithm, that aligns a pair of query and model gestures in both space and time. For every frame of the query sequence, feature detectors generate multiple hand region candidates. Dynamic programming is then used to compute both a global matching cost, which is used to recognize the query gesture, and a warping path, which aligns the query and model sequences in time, and also finds the best hand candidate region in every query frame. The proposed framework includes translation invariant recognition of gestures, a desirable property for many HCI systems. The performance of the approach is evaluated on a dataset of hand signed digits gestured by people wearing short sleeve shirts, in front of a background containing other non-hand skin-colored objects. The algorithm simultaneously localizes the gesturing hand and recognizes the hand-signed digit. Although DSTW is illustrated in a gesture recognition setting, the proposed algorithm is a general method for matching time series, that allows for multiple candidate feature vectors to be extracted at each time step.
Resumo:
Hand signals are commonly used in applications such as giving instructions to a pilot for airplane take off or direction of a crane operator by a foreman on the ground. A new algorithm for recognizing hand signals from a single camera is proposed. Typically, tracked 2D feature positions of hand signals are matched to 2D training images. In contrast, our approach matches the 2D feature positions to an archive of 3D motion capture sequences. The method avoids explicit reconstruction of the 3D articulated motion from 2D image features. Instead, the matching between the 2D and 3D sequence is done by backprojecting the 3D motion capture data onto 2D. Experiments demonstrate the effectiveness of the approach in an example application: recognizing six classes of basketball referee hand signals in video.
Resumo:
A new deformable shape-based method for color region segmentation is described. The method includes two stages: over-segmentation using a traditional color region segmentation algorithm, followed by deformable model-based region merging via grouping and hypothesis selection. During the second stage, region merging and object identification are executed simultaneously. A statistical shape model is used to estimate the likelihood of region groupings and model hypotheses. The prior distribution on deformation parameters is precomputed using principal component analysis over a training set of region groupings. Once trained, the system autonomously segments deformed shapes from the background, while not merging them with similarly colored adjacent objects. Furthermore, the recovered parametric shape model can be used directly in object recognition and comparison. Experiments in segmentation and image retrieval are reported.
Resumo:
Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.