295 resultados para mutual recognition


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This chapter presents a method for vote-based 3D shape recognition and registration, in particular using mean shift on 3D pose votes in the space of direct similarity transformations for the first time. We introduce a new distance between poses in this spacethe SRT distance. It is left-invariant, unlike Euclidean distance, and has a unique, closed-form mean, in contrast to Riemannian distance, so is fast to compute. We demonstrate improved performance over the state of the art in both recognition and registration on a (real and) challenging dataset, by comparing our distance with others in a mean shift framework, as well as with the commonly used Hough voting approach. © 2013 Springer-Verlag Berlin Heidelberg.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character level language models (LMs) can be used as a first level approximation to allowed syllable sequences. To test this idea, word and character level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of texts from a wide collection of text sources. Both hypothesis and model based combination techniques were investigated to combine word and character level LMs. Significant character error rate reductions up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history dependent multi-level LM that performs a log-linearly combination of character and word level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper discusses user target intention recognition algorithms for pointing - clicking tasks to reduce users' pointing time and difficulty. Predicting targets by comparing the bearing angles to targets proposed as one of the first algorithms [1] is compared with a Kalman Filter prediction algorithm. Accuracy and sensitivity of prediction are used as performance criteria. The outcomes of a standard point and click experiment are used for performance comparison, collected from both able-bodied and impaired users. © 2013 Springer-Verlag Berlin Heidelberg.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present Multi Scale Shape Index (MSSI), a novel feature for 3D object recognition. Inspired by the scale space filtering theory and Shape Index measure proposed by Koenderink & Van Doorn [6], this feature associates different forms of shape, such as umbilics, saddle regions, parabolic regions to a real valued index. This association is useful for representing an object based on its constituent shape forms. We derive closed form scale space equations which computes a characteristic scale at each 3D point in a point cloud without an explicit mesh structure. This characteristic scale is then used to estimate the Shape Index. We quantitatively evaluate the robustness and repeatability of the MSSI feature for varying object scales and changing point cloud density. We also quantify the performance of MSSI for object category recognition on a publicly available dataset. © 2013 Springer-Verlag.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Large margin criteria and discriminative models are two effective improvements for HMM-based speech recognition. This paper proposed a large margin trained log linear model with kernels for CSR. To avoid explicitly computing in the high dimensional feature space and to achieve the nonlinear decision boundaries, a kernel based training and decoding framework is proposed in this work. To make the system robust to noise a kernel adaptation scheme is also presented. Previous work in this area is extended in two directions. First, most kernels for CSR focus on measuring the similarity between two observation sequences. The proposed joint kernels defined a similarity between two observation-label sequence pairs on the sentence level. Second, this paper addresses how to efficiently employ kernels in large margin training and decoding with lattices. To the best of our knowledge, this is the first attempt at using large margin kernel-based log linear models for CSR. The model is evaluated on a noise corrupted continuous digit task: AURORA 2.0. © 2013 IEEE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Current methods for formation of detected chess-board vertices into a grid structure tend to be weak in situations with a warped grid, and false and missing vertex-features. In this paper we present a highly robust, yet efficient, scheme suitable for inference of regular 2D square mesh structure from vertices recorded both during projection of a chess-board pattern onto 3D objects, and in the more simple case of camera calibration. Examples of the method's performance in a lung function measuring application, observing chess-boards projected on to patients' chests, are given. The method presented is resilient to significant surface deformation, and tolerates inexact vertex-feature detection. This robustness results from the scheme's novel exploitation of feature orientation information. © 2013 IEEE.