31 resultados para speaker recognition systems

em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biometric system performance can be improved by means of data fusion. Several kinds of information can be fused in order to obtain a more accurate classification (identification or verification) of an input sample. In this paper we present a method for computing the weights in a weighted sum fusion for score combinations, by means of a likelihood model. The maximum likelihood estimation is set as a linear programming problem. The scores are derived from a GMM classifier working on a different feature extractor. Our experimental results assesed the robustness of the system in front a changes on time (different sessions) and robustness in front a change of microphone. The improvements obtained were significantly better (error bars of two standard deviations) than a uniform weighted sum or a uniform weighted product or the best single classifier. The proposed method scales computationaly with the number of scores to be fussioned as the simplex method for linear programming.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results for speaker recognition shows that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the results with clean speech (without saturation) is 87.76% for one microphone, and for speaker identification can reduce the minimum detection cost function with saturated test sentences from 6.42% to 4.15%, while the results with clean speech (without saturation) is 5.74% for one microphone and 7.02% for the other one.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results shows that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the results with clean speech (without saturation) is 87.76% for one microphone.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Behavior-based navigation of autonomous vehicles requires the recognition of the navigable areas and the potential obstacles. In this paper we describe a model-based objects recognition system which is part of an image interpretation system intended to assist the navigation of autonomous vehicles that operate in industrial environments. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. The recognition system constitutes an approach where different low-level vision techniques extract a multitude of image descriptors which are then analyzed using a rule-based reasoning system to interpret the image content. This system has been implemented using a rule-based cooperative expert system

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We describe a model-based objects recognition system which is part of an image interpretation system intended to assist autonomous vehicles navigation. The system is intended to operate in man-made environments. Behavior-based navigation of autonomous vehicles involves the recognition of navigable areas and the potential obstacles. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. The recognition system constitutes an approach where different low-level vision techniques extract a multitude of image descriptors which are then analyzed using a rule-based reasoning system to interpret the image content. This system has been implemented using CEES, the C++ embedded expert system shell developed in the Systems Engineering and Automatic Control Laboratory (University of Girona) as a specific rule-based problem solving tool. It has been especially conceived for supporting cooperative expert systems, and uses the object oriented programming paradigm

Relevância:

90.00% 90.00%

Publicador:

Resumo:

As part of the Affective Computing research field, the development of automatic affective recognition systems can enhance human-computer interactions by allowing the creation of interfaces that react to the user's emotional state. To that end, this Master Thesis brings affect recognition to nowadays most used human computer interface, mobile devices, by developing a facial expression recognition system able to perform detection under the difficult conditions of viewing angle and illumination that entails the interaction with a mobile device. Moreover, this Master Thesis proposes to combine emotional features detected from expression with contextual information of the current situation, to infer a complex and extensive emotional state of the user. Thus, a cognitive computational model of emotion is defined that provides a multicomponential affective state of the user through the integration of the detected emotional features into appraisal processes. In order to account for individual differences in the emotional experience, these processes can be adapted to the culture and personality of the user.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Informe de investigación realizado a partir de una estancia en el Équipe de Recherche en Syntaxe et Sémantique de la Université de Toulouse-Le Mirail, Francia, entre julio y setiembre de 2006. En la actualidad existen diversos diccionarios de siglas en línea. Entre ellos sobresalen Acronym Finder, Abbreviations.com y Acronyma; todos ellos dedicados mayoritariamente a las siglas inglesas. Al igual que los diccionarios en papel, este tipo de diccionarios presenta problemas de desactualización por la gran cantidad de siglas que se crean a diario. Por ejemplo, en 2001, un estudio de Pustejovsky et al. mostraba que en los abstracts de Medline aparecían mensualmente cerca de 12.000 nuevas siglas. El mecanismo de actualización empleado por estos recursos es la remisión de nuevas siglas por parte de los usuarios. Sin embargo, esta técnica tiene la desventaja de que la edición de la información es muy lenta y costosa. Un ejemplo de ello es el caso de Abbreviations.com que en octubre de 2006 tenía alrededor de 100.000 siglas pendientes de edición e incorporación definitiva. Como solución a este tipo de problema, se plantea el diseño de sistemas de detección y extracción automática de siglas a partir de corpus. El proceso de detección comporta dos pasos; el primero, consiste en la identificación de las siglas dentro de un corpus y, el segundo, la desambiguación, es decir, la selección de la forma desarrollada apropiada de una sigla en un contexto dado. En la actualidad, los sistemas de detección de siglas emplean métodos basados en patrones, estadística, aprendizaje máquina, o combinaciones de ellos. En este estudio se analizan los principales sistemas de detección y desambiguación de siglas y los métodos que emplean. Cada uno se evalúa desde el punto de vista del rendimiento, medido en términos de precisión (porcentaje de siglas correctas con respecto al número total de siglas extraídas por el sistema) y exhaustividad (porcentaje de siglas correctas identificadas por el sistema con respecto al número total de siglas existente en el corpus). Como resultado, se presentan los criterios para el diseño de un futuro sistema de detección de siglas en español.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Aplicación que realiza un entrenamiento de puntos de interés en rostros a partir de una colección de imágenes, posteriormente se puede verificar el resultado. Dada una imagen se comprueba el porcentaje de aciertos.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

En esta memoria expone el trabajo que se ha llevado a cabo para intentar crear un sistema de reconocimiento facial.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent ";topics"; using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual word representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Positioning a robot with respect to objects by using data provided by a camera is a well known technique called visual servoing. In order to perform a task, the object must exhibit visual features which can be extracted from different points of view. Then, visual servoing is object-dependent as it depends on the object appearance. Therefore, performing the positioning task is not possible in presence of nontextured objets or objets for which extracting visual features is too complex or too costly. This paper proposes a solution to tackle this limitation inherent to the current visual servoing techniques. Our proposal is based on the coded structured light approach as a reliable and fast way to solve the correspondence problem. In this case, a coded light pattern is projected providing robust visual features independently of the object appearance

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Desenvolupament una aplicació informàtica basada en un sistema de visió per computador, la qual permeti donar una resposta en forma d'informació a partir d'una query d'una imatge que conté una escena o objecte en concret de manera que permeti reconèixer els objectes que apareixen en una imatge per llavors donar informació referent al contingut de la imatge a l’usuari que ha fet la consulta. Resumint, es tracta d’analitzar, dissenyar i construir un sistemade visió per computador capaç de reconèixer objectes d’interès en imatges

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Aquest paper es divideix en 3 parts fonamentals, la primera relata el que pretén mostrar aquest estudi, que és aplicar els sistemes actuals de reconeixement facial en una base de dades d'obres d'art. Explica quins mètodes s'utilitzaran i perquè es interessant realitzar aquest estudi. La segona passa a mostrar el detall de les dades obtingudes en l'experiment, amb imatges i gràfics que facilitaran la comprensió. I en l'última part tenim la discussió dels resultats obtinguts en l'anàlisi i les seves posteriors conclusions.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper presents a pattern recognition method focused on paintings images. The purpose is construct a system able to recognize authors or art styles based on common elements of his work (here called patterns). The method is based on comparing images that contain the same or similar patterns. It uses different computer vision techniques, like SIFT and SURF, to describe the patterns in descriptors, K-Means to classify and simplify these descriptors, and RANSAC to determine and detect good results. The method are good to find patterns of known images but not so good if they are not.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Given a set of images of scenes containing different object categories (e.g. grass, roads) our objective is to discover these objects in each image, and to use this object occurrences to perform a scene classification (e.g. beach scene, mountain scene). We achieve this by using a supervised learning algorithm able to learn with few images to facilitate the user task. We use a probabilistic model to recognise the objects and further we classify the scene based on their object occurrences. Experimental results are shown and evaluated to prove the validity of our proposal. Object recognition performance is compared to the approaches of He et al. (2004) and Marti et al. (2001) using their own datasets. Furthermore an unsupervised method is implemented in order to evaluate the advantages and disadvantages of our supervised classification approach versus an unsupervised one