42 results for Speech Recognition System using LPC
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognition system. We study the effect of saturation on the test signals, modelling real situations in which the training material was recorded under controlled conditions but the test signals show a mismatch in input signal level (saturation). The experimental results show that a combination of several strategies can improve the recognition rate for saturated test sentences from 80% to 89.39%, while the result with clean speech (without saturation) is 87.76% for one microphone.
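The mismatch described above can be illustrated with a small sketch. It assumes saturation behaves as hard clipping at a fixed level, which the abstract does not state; the function names `saturate` and `clipped_fraction` and the clipping level are hypothetical, chosen only for illustration.

```python
import math

def saturate(samples, limit):
    """Hard-clip each sample to the range [-limit, +limit] (assumed saturation model)."""
    return [max(-limit, min(limit, s)) for s in samples]

def clipped_fraction(samples, limit):
    """Fraction of samples whose magnitude reaches the clipping level."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if abs(s) >= limit) / len(samples)

# A synthetic test tone standing in for a speech frame (illustrative only).
tone = [math.sin(2 * math.pi * n / 32) for n in range(64)]
clipped = saturate(tone, 0.5)
print(f"fraction of clipped samples: {clipped_fraction(clipped, 0.5):.2f}")
```

A measure such as the clipped fraction is one simple way to decide whether a test sentence needs compensation before it is passed to the recognizer.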
Abstract:
In this work we present a simulation of a recognition process that uses the perimeter characterization of simple plant leaves as the only discriminating parameter. Coding the data so that it is independent of leaf size and orientation may penalize recognition performance for some varieties. Border description sequences are then used, and Principal Component Analysis (PCA) is applied to study the best number of components for the classification task, which is implemented by means of a Support Vector Machine (SVM) system. The results obtained are satisfactory: compared with [4], our system improves the recognition success rate while reducing the variance.
Abstract:
In this work we explore multivariate empirical mode decomposition (mEMD) combined with a neural network classifier as a technique for face recognition tasks. Images are decomposed simultaneously by means of EMD, and the distance between the modes of the image and the modes of the representative image of each class is then calculated using three different distance measures. A neural network is then trained using 10-fold cross-validation in order to derive a classifier. Preliminary results (a classification rate above 98%) are satisfactory and justify a deeper investigation of how to apply mEMD to face recognition.
Abstract:
Background: Single Nucleotide Polymorphisms, among other types of sequence variants, constitute key elements in genetic epidemiology and pharmacogenomics. While sequence data about genetic variation is found in databases such as dbSNP, clues about the functional and phenotypic consequences of the variations are generally found in the biomedical literature. The identification of the relevant documents and the extraction of information from them are hampered by the large size of literature databases and the lack of a widely accepted standard notation for biomedical entities. Thus, automatic systems for identifying citations of allelic variants of genes in biomedical texts are required. Results: Our group has previously reported the development of OSIRIS, a system aimed at the retrieval of literature about allelic variants of genes (http://ibi.imim.es/osirisform.html). Here we describe the development of a new version of OSIRIS (OSIRISv1.2, http://ibi.imim.es/OSIRISv1.2.html) which incorporates a new entity recognition module and is built on top of a local mirror of the MEDLINE collection and HgenetInfoDB, a database that collects data on human gene sequence variations. The new entity recognition module is based on a pattern-based search algorithm for identifying variation terms in the texts and mapping them to dbSNP identifiers. The performance of OSIRISv1.2 was evaluated on a manually annotated corpus, resulting in 99% precision, 82% recall, and an F-score of 0.89. As an example, the application of the system for collecting literature citations for the allelic variants of genes related to the diseases intracranial aneurysm and breast cancer is presented. Conclusion: OSIRISv1.2 can be used to link literature references to dbSNP database entries with high accuracy, and is therefore suitable for collecting current knowledge on gene sequence variations and supporting the functional annotation of variation databases.
The application of OSIRISv1.2 in combination with controlled vocabularies like MeSH provides a way to identify associations of biomedical interest, such as those that relate SNPs with diseases.
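As a rough illustration of what a pattern-based search for variation terms can look like, the sketch below uses two regular expressions. These patterns are assumptions made for this example and are not the patterns used by OSIRISv1.2, which the abstract does not list; `find_variation_terms` is a hypothetical helper, and the step of mapping a term to a dbSNP identifier is not shown.

```python
import re

# Assumed pattern 1: dbSNP reference identifiers such as "rs1042522".
RS_ID = re.compile(r"\brs\d+\b", re.IGNORECASE)

# Assumed pattern 2: amino-acid substitutions in three-letter notation, e.g. "Arg72Pro".
AA3 = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
       "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val")
SUBSTITUTION = re.compile(rf"\b(?:{AA3})\d+(?:{AA3})\b")

def find_variation_terms(text):
    """Return (kind, matched_text, start_offset) tuples in order of appearance."""
    terms = [("rs_id", m.group(0), m.start()) for m in RS_ID.finditer(text)]
    terms += [("substitution", m.group(0), m.start())
              for m in SUBSTITUTION.finditer(text)]
    return sorted(terms, key=lambda t: t[2])

sentence = "The TP53 Arg72Pro polymorphism (rs1042522) was genotyped."
for kind, term, offset in find_variation_terms(sentence):
    print(kind, term, offset)
```

A real system would need many more patterns (nucleotide changes, one-letter amino-acid codes, free-text descriptions), which is where the lack of a standard notation mentioned in the abstract becomes the main difficulty.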
Abstract:
In this paper, a new algorithm for blind inversion of Wiener systems is presented. The algorithm is based on minimization of mutual information of the output samples. This minimization is done through a Minimization-Projection (MP) approach, using a nonparametric “gradient” of mutual information.
Abstract:
A Wiener system is a linear time-invariant filter, followed by an invertible nonlinear distortion. Assuming that the input signal is an independent and identically distributed (iid) sequence, we propose an algorithm for estimating the input signal only by observing the output of the Wiener system. The algorithm is based on minimizing the mutual information of the output samples, by means of a steepest descent gradient approach.
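The structure of a Wiener system can be sketched as follows. This is only a forward model plus a non-blind inversion with known parameters, shown to make the two stages concrete; it is not the blind, mutual-information-based algorithm of the paper. The filter taps `h`, the choice of `tanh` as the invertible nonlinearity, and all function names are assumptions for illustration.

```python
import math
import random

def fir_filter(x, h):
    """Linear time-invariant stage: y[n] = sum_k h[k] * x[n - k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def wiener_system(x, h, f=math.tanh):
    """Filter the input, then apply the invertible nonlinear distortion f."""
    return [f(v) for v in fir_filter(x, h)]

def invert_known(y, h, f_inv=math.atanh):
    """Non-blind inversion: undo the nonlinearity, then undo the filter recursively.

    Assumes h[0] != 0 and a minimum-phase filter so that the recursion is stable.
    """
    z = [f_inv(v) for v in y]
    x = []
    for n in range(len(z)):
        past = sum(h[k] * x[n - k] for k in range(1, len(h)) if n - k >= 0)
        x.append((z[n] - past) / h[0])
    return x

rng = random.Random(0)                              # fixed seed for repeatability
source = [rng.uniform(-1.0, 1.0) for _ in range(200)]  # iid input sequence
taps = [1.0, 0.5, -0.2]                             # assumed minimum-phase filter
observed = wiener_system(source, taps)
recovered = invert_known(observed, taps)
print(max(abs(a - b) for a, b in zip(source, recovered)))
```

In the blind setting neither `taps` nor the nonlinearity is known, and the inverse stages have to be adapted from the observed output alone, for example by driving the recovered samples towards statistical independence as the abstract describes.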
Abstract:
As part of the Affective Computing research field, the development of automatic affect recognition systems can enhance human-computer interaction by allowing the creation of interfaces that react to the user's emotional state. To that end, this Master's thesis brings affect recognition to today's most widely used human-computer interface, the mobile device, by developing a facial expression recognition system able to perform detection under the difficult viewing-angle and illumination conditions that interaction with a mobile device entails. Moreover, this thesis proposes combining emotional features detected from facial expression with contextual information about the current situation to infer a complex and extensive emotional state of the user. Thus, a cognitive computational model of emotion is defined that provides a multicomponential affective state of the user by integrating the detected emotional features into appraisal processes. In order to account for individual differences in emotional experience, these processes can be adapted to the culture and personality of the user.
Abstract:
This project gives an introduction to speech recognizers, how they work and their mathematical basis. Once all the concepts are clear, we show the method we followed to build our own speech recognizer for Catalan using the HTK tools. Its strengths and weaknesses are evaluated through various tests performed on its components. In addition, the project rounds off the work by implementing an automatic dictation system that exploits the speech recognizer using Julius.
Abstract:
We describe a series of experiments in which we start with English-to-French and English-to-Japanese versions of an Open Source rule-based speech translation system for a medical domain, and bootstrap corresponding statistical systems. Comparative evaluation reveals that the rule-based systems are still significantly better than the statistical ones, despite the fact that considerable effort has been invested in tuning both the recognition and translation components; also, a hybrid system only marginally improved recall at the cost of a loss in precision. The result suggests that rule-based architectures may still be preferable to statistical ones for safety-critical speech translation tasks.
Abstract:
Behavior-based navigation of autonomous vehicles requires the recognition of the navigable areas and the potential obstacles. In this paper we describe a model-based object recognition system which is part of an image interpretation system intended to assist the navigation of autonomous vehicles that operate in industrial environments. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. The recognition system constitutes an approach where different low-level vision techniques extract a multitude of image descriptors which are then analyzed using a rule-based reasoning system to interpret the image content. This system has been implemented using a rule-based cooperative expert system.
Abstract:
We describe a model-based object recognition system which is part of an image interpretation system intended to assist autonomous vehicle navigation. The system is intended to operate in man-made environments. Behavior-based navigation of autonomous vehicles involves the recognition of navigable areas and the potential obstacles. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. The recognition system constitutes an approach where different low-level vision techniques extract a multitude of image descriptors which are then analyzed using a rule-based reasoning system to interpret the image content. This system has been implemented using CEES, the C++ embedded expert system shell developed in the Systems Engineering and Automatic Control Laboratory (University of Girona) as a specific rule-based problem-solving tool. It has been especially conceived for supporting cooperative expert systems, and uses the object-oriented programming paradigm.
Abstract:
Report for the scientific sojourn at the Swiss Federal Institute of Technology Zurich, Switzerland, between September and December 2007. In order to make robots useful assistants in our everyday life, the ability to learn and recognize objects is of essential importance. However, object recognition in real scenes is one of the most challenging problems in computer vision, as it must cope with many difficulties. Furthermore, in mobile robotics a new challenge is added to the list: computational complexity. In a dynamic world, information about the objects in the scene can become obsolete before it is ready to be used if the detection algorithm is not fast enough. Two recent object recognition techniques have achieved notable results: the constellation approach proposed by Lowe and the bag of words approach proposed by Nistér and Stewénius. The Lowe constellation approach is the one currently being used in the robot localization project of the COGNIRON project. This report is divided into two main sections. The first section briefly reviews the object recognition system currently in use, the Lowe approach, and brings to light the drawbacks found for object recognition in the context of indoor mobile robot navigation. The proposed improvements to the algorithm are also described. The second section reviews the alternative bag of words method, as well as several experiments conducted to evaluate its performance with our own object databases. Furthermore, some modifications to the original algorithm are proposed to make it suitable for object detection in unsegmented images.
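The core idea of a bag of words representation can be sketched in a few lines. This is a simplified illustration that assumes a flat vocabulary and plain Euclidean nearest-neighbour assignment; it does not reproduce the method of Nistér and Stewénius or the modifications proposed in the report, and the two-dimensional descriptors and the names `nearest_word` and `bag_of_words` are hypothetical.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (squared Euclidean)."""
    return min(range(len(vocabulary)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(descriptor, vocabulary[i])))

def bag_of_words(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one image."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)

# Toy data: two "visual words" and four local descriptors from one image.
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (11.0, 10.0)]
print(bag_of_words(descriptors, vocabulary))
```

Because the histogram discards where each descriptor was found, comparing two images reduces to comparing two fixed-length vectors, which is what makes the approach attractive when detection speed matters.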
Abstract:
Human-machine interaction by voice covers many research areas. Notable among them are speech recognition, speech synthesis and identification, speaker verification and identification, and voice activation (commands) of robotic systems. Recognizing speech is natural and simple for people but a complex task for machines, for which various methodologies and techniques exist, among them neural networks. The aim of this work is to develop a Matlab tool for recognizing and identifying words spoken by a speaker, from a set of possible words, with good reliability within pre-established margins. The system is independent of the speaker who utters the word; that is, the speaker will not have taken part in the training of the system. An interface has been designed that allows the voice signal to be acquired and processed using neural networks and other techniques. By adding a control module to the system, it could be used to give commands to a robot such as the Alfa6Uvic or to any other device.
Abstract:
Hand gesture recognition (HGR) is currently an important research field because of the variety of situations in which it is necessary to communicate through signs, such as communication between people who use sign language and those who do not. This project presents a real-time hand gesture recognition method using the Kinect sensor for Microsoft Xbox, implemented in a Linux (Ubuntu) environment in the Python programming language and using the OpenCV computer vision library to process the data on a conventional laptop. Thanks to the Kinect sensor's ability to capture depth data from a scene, the positions and trajectories of objects can be determined in three dimensions, which makes possible a complete real-time analysis of an image or a sequence of images. The proposed recognition procedure is based on segmenting the image so as to work only with the hand, detecting the contours, and then obtaining the convex hull and the convexity defects, which finally serve to determine the number of fingers and arrive at the interpretation of the gesture; the final result is the transcription of its meaning in a window that serves as the interface with the interlocutor. The application can recognize the numbers 0 to 5 (since only one hand is analyzed), some popular gestures and some of the letters of the fingerspelling alphabet of Catalan Sign Language. The project is thus a gateway to the field of gesture recognition and the basis of a future sign language recognition system capable of transcribing both dynamic signs and the fingerspelling alphabet.
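The last steps of the procedure (convex hull, convexity defects, finger count) can be sketched without any external library. This is a minimal pure-Python illustration, not the project's OpenCV implementation. It assumes the contour is a list of distinct `(x, y)` tuples in traversal order, and it uses a simple rule (deep defects plus one) that cannot separate a fist from a single raised finger. The depth threshold and all function names are hypothetical.

```python
import math

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]
    return half(pts) + half(reversed(pts))

def defect_depths(contour):
    """Depth of the deepest contour point between each pair of consecutive hull vertices."""
    hull = set(convex_hull(contour))
    n = len(contour)
    on_hull = [i for i, p in enumerate(contour) if p in hull]
    if len(on_hull) < 3:
        return []
    depths = []
    for a, b in zip(on_hull, on_hull[1:] + [on_hull[0] + n]):
        p, q = contour[a], contour[b % n]
        edge = math.hypot(q[0] - p[0], q[1] - p[1])
        # Perpendicular distance of each in-between contour point to the hull edge.
        depth = max((abs(cross(p, q, contour[i % n])) / edge
                     for i in range(a + 1, b)), default=0.0)
        if depth > 0:
            depths.append(depth)
    return depths

def count_fingers(contour, min_depth):
    """Assumed rule: k sufficiently deep defects suggest k + 1 raised fingers."""
    deep = sum(1 for d in defect_depths(contour) if d >= min_depth)
    return deep + 1 if deep else 0

# Synthetic "hand" outline with three pointed fingers and two valleys between them.
hand = [(0, 0), (10, 0), (9, 9), (7, 4), (5, 11), (3, 4), (1, 9)]
print(count_fingers(hand, min_depth=2.0))
```

With a real depth image, the contour would come from the segmented hand region, and the threshold would have to be tuned to the hand's apparent size so that small irregularities along the outline are not counted as gaps between fingers.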
Abstract:
Semantic Web technology is able to provide the required computational semantics for interoperability of learning resources across different Learning Management Systems (LMS) and Learning Object Repositories (LOR). The EU research project LUISA (Learning Content Management System Using Innovative Semantic Web Services Architecture) addresses the development of a reference semantic architecture for the major challenges in the search, interchange and delivery of learning objects in a service-oriented context. One of the key issues, highlighted in this paper, is Digital Rights Management (DRM) interoperability. A Semantic Web approach to copyright management has been followed, which places a Copyright Ontology as the key component for interoperability among existing DRM systems and other licensing schemes like Creative Commons. Moreover, Semantic Web tools like reasoners, rule engines and semantic queries facilitate the implementation of an interoperable copyright management component in the LUISA architecture.