342 resultados para Speaker Recognition


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The capability to automatically identify shapes, objects and materials from the image content through direct and indirect methodologies has enabled the development of several civil engineering related applications that assist in the design, construction and maintenance of construction projects. This capability is a product of the technological breakthroughs in the area of Image Processing that has allowed for the development of a large number of digital imaging applications in all industries. In this paper, an automated and content based shape recognition model is presented. This model was devised to enhance the recognition capabilities of our existing material based image retrieval model. The shape recognition model is based on clustering techniques, and specifically those related with material and object segmentation. The model detects the borders of each previously detected material depicted in the image, examines its linearity (length/width ratio) and detects its orientation (horizontal/vertical). The results emonstrate the suitability of this model for construction site image retrieval purposes and reveal the capability of existing clustering technologies to accurately identify the shape of a wealth of materials from construction site images.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

As-built models have been proven useful in many project-related applications, such as progress monitoring and quality control. However, they are not widely produced in most projects because a lot of effort is still necessary to manually convert remote sensing data from photogrammetry or laser scanning to an as-built model. In order to automate the generation of as-built models, the first and fundamental step is to automatically recognize infrastructure-related elements from the remote sensing data. This paper outlines a framework for creating visual pattern recognition models that can automate the recognition of infrastructure-related elements based on their visual features. The framework starts with identifying the visual characteristics of infrastructure element types and numerically representing them using image analysis tools. The derived representations, along with their relative topology, are then used to form element visual pattern recognition (VPR) models. So far, the VPR models of four infrastructure-related elements have been created using the framework. The high recognition performance of these models validates the effectiveness of the framework in recognizing infrastructure-related elements.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Pavement condition assessment is essential when developing road network maintenance programs. In practice, pavement sensing is to a large extent automated when regarding highway networks. Municipal roads, however, are predominantly surveyed manually due to the limited amount of expensive inspection vehicles. As part of a research project that proposes an omnipresent passenger vehicle network for comprehensive and cheap condition surveying of municipal road networks this paper deals with pothole recognition. Existing methods either rely on expensive and high-maintenance range sensors, or make use of acceleration data, which can only provide preliminary and rough condition surveys. In our previous work we created a pothole detection method for pavement images. In this paper we present an improved recognition method for pavement videos that incrementally updates the texture signature for intact pavement regions and uses vision tracking to track detected potholes. The method is tested and results demonstrate its reasonable efficiency.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This book will be of particular interest to academics, researchers, and graduate students at universities and industrial practitioners seeking to apply mobile and pervasive computing systems to improve construction industry productivity.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For many applications, it is necessary to produce speech transcriptions in a causal fashion. To produce high quality transcripts, speaker adaptation is often used. This requires online speaker clustering and incremental adaptation techniques to be developed. This paper presents an integrated approach to online speaker clustering and adaptation which allows efficient clustering of speakers using the same accumulated statistics that are normally used for adaptation. Using a consistent criterion for both clustering and adaptation should yield gains for both stages. The proposed approach is evaluated on a meetings transcription task using audio from multiple distant microphones. Consistent gains over standard clustering and adaptation were obtained. Copyright © 2011 ISCA.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In spite of over two decades of intense research, illumination and pose invariance remain prohibitively challenging aspects of face recognition for most practical applications. The objective of this work is to recognize faces using video sequences both for training and recognition input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern have a wide variability and face images are of low resolution. The central contribution is an illumination invariant, which we show to be suitable for recognition from video of loosely constrained head motion. In particular there are three contributions: (i) we show how a photometric model of image formation can be combined with a statistical model of generic face appearance variation to exploit the proposed invariant and generalize in the presence of extreme illumination changes; (ii) we introduce a video sequence re-illumination algorithm to achieve fine alignment of two video sequences; and (iii) we use the smoothness of geodesically local appearance manifold structure and a robust same-identity likelihood to achieve robustness to unseen head poses. We describe a fully automatic recognition system based on the proposed method and an extensive evaluation on 323 individuals and 1474 video sequences with extreme illumination, pose and head motion variation. Our system consistently achieved a nearly perfect recognition rate (over 99.7% on all four databases). © 2012 Elsevier Ltd All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This chapter presents a method for vote-based 3D shape recognition and registration, in particular using mean shift on 3D pose votes in the space of direct similarity transformations for the first time. We introduce a new distance between poses in this spacethe SRT distance. It is left-invariant, unlike Euclidean distance, and has a unique, closed-form mean, in contrast to Riemannian distance, so is fast to compute. We demonstrate improved performance over the state of the art in both recognition and registration on a (real and) challenging dataset, by comparing our distance with others in a mean shift framework, as well as with the commonly used Hough voting approach. © 2013 Springer-Verlag Berlin Heidelberg.