15 resultados para hidden markov model (HMM)

em Universidad Politécnica de Madrid


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Although most of the research on Cognitive Radio is focused on communication bands above the HF upper limit (30 MHz), Cognitive Radio principles can also be applied to HF communications to make use of the extremely scarce spectrum more efficiently. In this work we consider legacy users as primary users since these users transmit without resorting to any smart procedure, and our stations using the HFDVL (HF Data+Voice Link) architecture as secondary users. Our goal is to enhance an efficient use of the HF band by detecting the presence of uncoordinated primary users and avoiding collisions with them while transmitting in different HF channels using our broad-band HF transceiver. A model of the primary user activity dynamics in the HF band is developed in this work to make short-term predictions of the sojourn time of a primary user in the band and avoid collisions. It is based on Hidden Markov Models (HMM) which are a powerful tool for modelling stochastic random processes and are trained with real measurements of the 14 MHz band. By using the proposed HMM based model, the prediction model achieves an average 10.3% prediction error rate with one minute-long channel knowledge but it can be reduced when this knowledge is extended: with the previous 8 min knowledge, an average 5.8% prediction error rate is achieved. These results suggest that the resulting activity model for the HF band could actually be used to predict primary users activity and included in a future HF cognitive radio based station.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cognitive Radio principles can be applied to HF communications to make a more efficient use of the extremely scarce spectrum. In this contribution we focus on analyzing the usage of the available channels done by the legacy users, which are regarded as primary users since they are allowed to transmit without resorting any smart procedure, and consider the possibilities for our stations -over the HFDVL (HF Data+Voice Link) architecture- to participate as secondary users. Our goal is to enhance an efficient use of the HF band by detecting the presence of uncoordinated primary users and avoiding collisions with them while transmitting in different HF channels using our broad-band HF transceiver. A model of the primary user activity dynamics in the HF band is developed in this work. It is based on Hidden Markov Models (HMM) which are a powerful tool for modelling stochastic random processes, and is trained with real measurements from the 14 MHz band.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Este trabajo de Tesis ha abordado el objetivo de dar robustez y mejorar la Detección de Actividad de Voz en entornos acústicos adversos con el fin de favorecer el comportamiento de muchas aplicaciones vocales, por ejemplo aplicaciones de telefonía basadas en reconocimiento automático de voz, aplicaciones en sistemas de transcripción automática, aplicaciones en sistemas multicanal, etc. En especial, aunque se han tenido en cuenta todos los tipos de ruido, se muestra especial interés en el estudio de las voces de fondo, principal fuente de error de la mayoría de los Detectores de Actividad en la actualidad. Las tareas llevadas a cabo poseen como punto de partida un Detector de Actividad basado en Modelos Ocultos de Markov, cuyo vector de características contiene dos componentes: la energía normalizada y la variación de la energía. Las aportaciones fundamentales de esta Tesis son las siguientes: 1) ampliación del vector de características de partida dotándole así de información espectral, 2) ajuste de los Modelos Ocultos de Markov al entorno y estudio de diferentes topologías y, finalmente, 3) estudio e inclusión de nuevas características, distintas de las del punto 1, para filtrar los pulsos de pronunciaciones que proceden de las voces de fondo. Los resultados de detección, teniendo en cuenta los tres puntos anteriores, muestran con creces los avances realizados y son significativamente mejores que los resultados obtenidos, bajo las mismas condiciones, con otros detectores de actividad de referencia. This work has been focused on improving the robustness at Voice Activity Detection in adverse acoustic environments in order to enhance the behavior of many vocal applications, for example telephony applications based on automatic speech recognition, automatic transcription applications, multichannel systems applications, and so on. In particular, though all types of noise have taken into account, this research has special interest in the study of pronunciations coming from far-field speakers, the main error source of most activity detectors today. The tasks carried out have, as starting point, a Hidden Markov Models Voice Activity Detector which a feature vector containing two components: normalized energy and delta energy. The key points of this Thesis are the following: 1) feature vector extension providing spectral information, 2) Hidden Markov Models adjustment to environment and study of different Hidden Markov Model topologies and, finally, 3) study and inclusion of new features, different from point 1, to reject the pronunciations coming from far-field speakers. Detection results, taking into account the above three points, show the advantages of using this method and are significantly better than the results obtained under the same conditions by other well-known voice activity detectors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

En esta tesis doctoral se propone una técnica biométrica de verificación en teléfonos móviles consistente en realizar una firma en el aire con la mano que sujeta el teléfono móvil. Los acelerómetros integrados en el dispositivo muestrean las aceleraciones del movimiento de la firma en el aire, generando tres señales temporales que pueden utilizarse para la verificación del usuario. Se proponen varios enfoques para la implementación del sistema de verificación, a partir de los enfoques más utilizados en biometría de firma manuscrita: correspondencia de patrones, con variantes de los algoritmos de Needleman-Wusch (NW) y Dynamic Time Warping (DTW), modelos ocultos de Markov (HMM) y clasificador estadístico basado en Máquinas de Vector Soporte (SVM). Al no existir bases de datos públicas de firmas en el aire y con el fin de evaluar los métodos propuestos en esta tesis doctoral, se han capturado dos con distintas características; una con falsificaciones reales a partir del estudio de las grabaciones de usuarios auténticos y otra con muestras de usuarios obtenidas en diferentes sesiones a lo largo del tiempo. Utilizando estas bases de datos se han evaluado una gran cantidad de algoritmos para implementar un sistema de verificación basado en firma en el aire. Esta evaluación se ha realizado de acuerdo con el estándar ISO/IEC 19795, añadiendo el caso de verificación en mundo abierto no incluido en la norma. Además, se han analizado las características que hacen que una firma sea suficientemente segura. Por otro lado, se ha estudiado la permanencia de las firmas en el aire a lo largo del tiempo, proponiendo distintos métodos de actualización, basados en una adaptación dinámica del patrón, para mejorar su rendimiento. Finalmente, se ha implementado un prototipo de la técnica de firma en el aire para teléfonos Android e iOS. Los resultados de esta tesis doctoral han tenido un gran impacto, generando varias publicaciones en revistas internacionales, congresos y libros. La firma en el aire ha sido nombrada también en varias revistas de divulgación, portales de noticias Web y televisión. Además, se han obtenido varios premios en competiciones de ideas innovadoras y se ha firmado un acuerdo de explotación de la tecnología con una empresa extranjera. ABSTRACT This thesis proposes a biometric verification technique on mobile phones consisting on making a signature in the air with the hand holding a mobile phone. The accelerometers integrated in the device capture the movement accelerations, generating three temporal signals that can be used for verification. This thesis suggests several approaches for implementing the verification system, based on the most widely used approaches in handwritten signature biometrics: template matching, with a lot of variations of the Needleman- Wusch (NW) and Dynamic Time Warping (DTW) algorithms, Hidden Markov Models (HMM) and Supported Vector Machines (SVM). As there are no public databases of in-air signatures and with the aim of assessing the proposed methods, there have been captured two databases; one. with real falsification attempts from the study of recordings captured when genuine users made their signatures in front of a camera, and other, with samples obtained in different sessions over a long period of time. These databases have been used to evaluate a lot of algorithms in order to implement a verification system based on in-air signatures. This evaluation has been conducted according to the standard ISO/IEC 19795, adding the open-set verification scenario not included in the norm. In addition, the characteristics of a secure signature are also investigated, as well as the permanence of in-air signatures over time, proposing several updating strategies to improve its performance. Finally, a prototype of in-air signature has been developed for iOS and Android phones. The results of this thesis have achieved a high impact, publishing several articles in SCI journals, conferences and books. The in-air signature deployed in this thesis has been also referred in numerous media. Additionally, this technique has won several awards in the entrepreneurship field and also an exploitation agreement has been signed with a foreign company.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

For most of us, speaking in a non-native language involves deviating to some extent from native pronunciation norms. However, the detailed basis for foreign accent (FA) remains elusive, in part due to methodological challenges in isolating segmental from suprasegmental factors. The current study examines the role of segmental features in conveying FA through the use of a generative approach in which accent is localised to single consonantal segments. Three techniques are evaluated: the first requires a highly-proficiency bilingual to produce words with isolated accented segments; the second uses cross-splicing of context-dependent consonants from the non-native language into native words; the third employs hidden Markov model synthesis to blend voice models for both languages. Using English and Spanish as the native/non-native languages respectively, listener cohorts from both languages identified words and rated their degree of FA. All techniques were capable of generating accented words, but to differing degrees. Naturally-produced speech led to the strongest FA ratings and synthetic speech the weakest, which we interpret as the outcome of over-smoothing. Nevertheless, the flexibility offered by synthesising localised accent encourages further development of the method.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Human Activity Recognition (HAR) is an emerging research field with the aim to identify the actions carried out by a person given a set of observations and the surrounding environment. The wide growth in this research field inside the scientific community is mainly explained by the high number of applications that are arising in the last years. A great part of the most promising applications are related to the healthcare field, where it is possible to track the mobility of patients with motor dysfunction as also the physical activity in patients with cardiovascular risk. Until a few years ago, by using distinct kind of sensors, a patient follow-up was possible. However, far from being a long-term solution and with the smartphone irruption, that monitoring can be achieved in a non-invasive way by using the embedded smartphone’s sensors. For these reasons this Final Degree Project arises with the main target to evaluate new feature extraction techniques in order to carry out an activity and user recognition, and also an activity segmentation. The recognition is done thanks to the inertial signals integration obtained by two widespread sensors in the greater part of smartphones: accelerometer and gyroscope. In particular, six different activities are evaluated walking, walking-upstairs, walking-downstairs, sitting, standing and lying. Furthermore, a segmentation task is carried out taking into account the activities performed by thirty users. This can be done by using Hidden Markov Models and also a set of tools tested satisfactory in speech recognition: HTK (Hidden Markov Model Toolkit).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

El Reconocimiento de Actividades Humanas es un área de investigación emergente, cuyo objetivo principal es identificar las acciones realizadas por un sujeto analizando las señales obtenidas a partir de unos sensores. El rápido crecimiento de este área de investigación dentro de la comunidad científica se explica, en parte, por el elevado número de aplicaciones que están surgiendo en los últimos años. Gran parte de las aplicaciones más prometedoras se encuentran en el campo de la salud, donde se puede hacer un seguimiento del nivel de movilidad de pacientes con trastornos motores, así como monitorizar el nivel de actividad física en pacientes con riesgo cardiovascular. Hasta hace unos años, mediante el uso de distintos tipos de sensores se podía hacer un seguimiento del paciente. Sin embargo, lejos de ser una solución a largo plazo y gracias a la irrupción del teléfono inteligente, este seguimiento se puede hacer de una manera menos invasiva, haciendo uso de la gran variedad de sensores integrados en este tipo de dispositivos. En este contexto nace este Trabajo de Fin de Grado, cuyo principal objetivo es evaluar nuevas técnicas de extracción de características para llevar a cabo un reconocimiento de actividades y usuarios así como una segmentación de aquellas. Este reconocimiento se hace posible mediante la integración de señales inerciales obtenidas por dos sensores presentes en la gran mayoría de teléfonos inteligentes: acelerómetro y giróscopo. Concretamente, se evalúan seis tipos de actividades realizadas por treinta usuarios: andar, subir escaleras, bajar escaleras, estar sentado, estar de pie y estar tumbado. Además y de forma paralela, se realiza una segmentación temporal de los distintos tipos de actividades realizadas por dichos usuarios. Todo ello se llevará a cabo haciendo uso de los Modelos Ocultos de Markov, así como de un conjunto de herramientas probadas satisfactoriamente en reconocimiento del habla: HTK (Hidden Markov Model Toolkit).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Light detection and ranging (LiDAR) technology is beginning to have an impact on agriculture. Canopy volume and/or fruit tree leaf area can be estimated using terrestrial laser sensors based on this technology. However, the use of these devices may have different options depending on the resolution and scanning mode. As a consequence, data accuracy and LiDAR derived parameters are affected by sensor configuration, and may vary according to vegetative characteristics of tree crops. Given this scenario, users and suppliers of these devices need to know how to use the sensor in each case. This paper presents a computer program to determine the best configuration, allowing simulation and evaluation of different LiDAR configurations in various tree structures (or training systems). The ultimate goal is to optimise the use of laser scanners in field operations. The software presented generates a virtual orchard, and then allows the scanning simulation with a laser sensor. Trees are created using a hidden Markov tree (HMT) model. Varying the foliar structure of the orchard the LiDAR simulation was applied to twenty different artificially created orchards with or without leaves from two positions (lateral and zenith). To validate the laser sensor configuration, leaf surface of simulated trees was compared with the parameters obtained by LiDAR measurements: the impacted leaf area, the impacted total area (leaves and wood), and th impacted area in the three outer layers of leaves.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present MBIS (Multivariate Bayesian Image Segmentation tool), a clustering tool based on the mixture of multivariate normal distributions model. MBIS supports multi-channel bias field correction based on a B-spline model. A second methodological novelty is the inclusion of graph-cuts optimization for the stationary anisotropic hidden Markov random field model. Along with MBIS, we release an evaluation framework that contains three different experiments on multi-site data. We first validate the accuracy of segmentation and the estimated bias field for each channel. MBIS outperforms a widely used segmentation tool in a cross-comparison evaluation. The second experiment demonstrates the robustness of results on atlas-free segmentation of two image sets from scan-rescan protocols on 21 healthy subjects. Multivariate segmentation is more replicable than the monospectral counterpart on T1-weighted images. Finally, we provide a third experiment to illustrate how MBIS can be used in a large-scale study of tissue volume change with increasing age in 584 healthy subjects. This last result is meaningful as multivariate segmentation performs robustly without the need for prior knowledge.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Introduction Diffusion weighted Imaging (DWI) techniques are able to measure, in vivo and non-invasively, the diffusivity of water molecules inside the human brain. DWI has been applied on cerebral ischemia, brain maturation, epilepsy, multiple sclerosis, etc. [1]. Nowadays, there is a very high availability of these images. DWI allows the identification of brain tissues, so its accurate segmentation is a common initial step for the referred applications. Materials and Methods We present a validation study on automated segmentation of DWI based on the Gaussian mixture and hidden Markov random field models. This methodology is widely solved with iterative conditional modes algorithm, but some studies suggest [2] that graph-cuts (GC) algorithms improve the results when initialization is not close to the final solution. We implemented a segmentation tool integrating ITK with a GC algorithm [3], and a validation software using fuzzy overlap measures [4]. Results Segmentation accuracy of each tool is tested against a gold-standard segmentation obtained from a T1 MPRAGE magnetic resonance image of the same subject, registered to the DWI space. The proposed software shows meaningful improvements by using the GC energy minimization approach on DTI and DSI (Diffusion Spectrum Imaging) data. Conclusions The brain tissues segmentation on DWI is a fundamental step on many applications. Accuracy and robustness improvements are achieved with the proposed software, with high impact on the application’s final result.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents some ideas about a new neural network architecture that can be compared to a Taylor analysis when dealing with patterns. Such architecture is based on lineal activation functions with an axo-axonic architecture. A biological axo-axonic connection between two neurons is defined as the weight in a connection in given by the output of another third neuron. This idea can be implemented in the so called Enhanced Neural Networks in which two Multilayer Perceptrons are used; the first one will output the weights that the second MLP uses to computed the desired output. This kind of neural network has universal approximation properties even with lineal activation functions. There exists a clear difference between cooperative and competitive strategies. The former ones are based on the swarm colonies, in which all individuals share its knowledge about the goal in order to pass such information to other individuals to get optimum solution. The latter ones are based on genetic models, that is, individuals can die and new individuals are created combining information of alive one; or are based on molecular/celular behaviour passing information from one structure to another. A swarm-based model is applied to obtain the Neural Network, training the net with a Particle Swarm algorithm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Probabilistic modeling is the de�ning characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. `1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from di�erent aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve signi�cantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random �eld model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, speci�cally models inspired from multi-dimensional Bayesian network classi�ers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objectivevariable and objective-objective relationships. An extensive experimental study shows the e�ectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely �-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on `1-regularization for multi-objective feature subset selection in classi�cation, where six di�erent measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two di�erent Bayesian classi�ers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We apply diffusion strategies to propose a cooperative reinforcement learning algorithm, in which agents in a network communicate with their neighbors to improve predictions about their environment. The algorithm is suitable to learn off-policy even in large state spaces. We provide a mean-square-error performance analysis under constant step-sizes. The gain of cooperation in the form of more stability and less bias and variance in the prediction error, is illustrated in the context of a classical model. We show that the improvement in performance is especially significant when the behavior policy of the agents is different from the target policy under evaluation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this study, a method for vehicle tracking through video analysis based on Markov chain Monte Carlo (MCMC) particle filtering with metropolis sampling is proposed. The method handles multiple targets with low computational requirements and is, therefore, ideally suited for advanced-driver assistance systems that involve real-time operation. The method exploits the removed perspective domain given by inverse perspective mapping (IPM) to define a fast and efficient likelihood model. Additionally, the method encompasses an interaction model using Markov Random Fields (MRF) that allows treatment of dependencies between the motions of targets. The proposed method is tested in highway sequences and compared to state-of-the-art methods for vehicle tracking, i.e., independent target tracking with Kalman filtering (KF) and joint tracking with particle filtering. The results showed fewer tracking failures using the proposed method.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes an emotion transplantation method capable of modifying a synthetic speech model through the use of CSMAPLR adaptation in order to incorporate emotional information learned from a different speaker model while maintaining the identity of the original speaker as much as possible. The proposed method relies on learning both emotional and speaker identity information by means of their adaptation function from an average voice model, and combining them into a single cascade transform capable of imbuing the desired emotion into the target speaker. This method is then applied to the task of transplanting four emotions (anger, happiness, sadness and surprise) into 3 male speakers and 3 female speakers and evaluated in a number of perceptual tests. The results of the evaluations show how the perceived naturalness for emotional text significantly favors the use of the proposed transplanted emotional speech synthesis when compared to traditional neutral speech synthesis, evidenced by a big increase in the perceived emotional strength of the synthesized utterances at a slight cost in speech quality. A final evaluation with a robotic laboratory assistant application shows how by using emotional speech we can significantly increase the students’ satisfaction with the dialog system, proving how the proposed emotion transplantation system provides benefits in real applications.