3 resultados para Unstressed Vowels
em Universidad Politécnica de Madrid
Resumo:
We present a novel approach using both sustained vowels and connected speech, to detect obstructive sleep apnea (OSA) cases within a homogeneous group of speakers. The proposed scheme is based on state-of-the-art GMM-based classifiers, and acknowledges specifically the way in which acoustic models are trained on standard databases, as well as the complexity of the resulting models and their adaptation to specific data. Our experimental database contains a suitable number of utterances and sustained speech from healthy (i.e control) and OSA Spanish speakers. Finally, a 25.1% relative reduction in classification error is achieved when fusing continuous and sustained speech classifiers. Index Terms: obstructive sleep apnea (OSA), gaussian mixture models (GMMs), background model (BM), classifier fusion.
Resumo:
A partir de un simulador de vocales denominado Vox, programado en MATLAB, desarrollado originalmente en la Universidad Técnica de Aquisgrán por Malte Kob [1] y mejorado en el Departamento de ICS de la EUITT [2], se pueden generar voces sintéticas. La principal limitación del simulador es que sólo puede generar vocales sintéticas, además la simulación se realiza a partir de parámetros anatómicos y fisiológicos fijos. La estructura actual del programa dificulta la modificación rápida de cualquiera de los parámetros básicos de la misma, circunstancia que podría mejorar mediante una interfaz gráfica. El proyecto consistirá, por un lado, en completar el simulador haciendo posible también la síntesis a partir de los parámetros anatómicos de hombres, mujeres y niños; y por otro, en el diseño e implementación de una interfaz gráfica de usuario que nos permita seleccionar los diferentes parámetros físicos para la simulación y recoger los resultados de la misma de manera más sencilla. Starting from a vowels simulator called Vox, programmed in MATLAB, originally developed in the Technical college of Aquisgrán by Malte Kob [1] and improved in the ICS Department of the EUITT [2], with this programme you can generate synthetic voices. The main limitation of the simulator is that it only can generate synthetic vowels; moreover the simulation is made from anatomical and physiological fixed parameters. The current structure of the programme complicates the quick modification of any of the basic parameters of it, circumstance that could be improved through a graphic interface. On the one hand, the project consists in completing the simulator doing the synthesis possible, from the anatomical woman, men and children parameters; on the other hand, the design and implementation of a graphic user interface, that allow us to select different physical parameters to the simulation and gather the results of it in a simple way.
Resumo:
Phonation distortion leaves relevant marks in a speaker's biometric profile. Dysphonic voice production may be used for biometrical speaker characterization. In the present paper phonation features derived from the glottal source (GS) parameterization, after vocal tract inversion, is proposed for dysphonic voice characterization in Speaker Verification tasks. The glottal source derived parameters are matched in a forensic evaluation framework defining a distance-based metric specification. The phonation segments used in the study are derived from fillers, long vowels, and other phonation segments produced in spontaneous telephone conversations. Phonated segments from a telephonic database of 100 male Spanish native speakers are combined in a 10-fold cross-validation task to produce the set of quality measurements outlined in the paper. Shimmer, mucosal wave correlate, vocal fold cover biomechanical parameter unbalance and a subset of the GS cepstral profile produce accuracy rates as high as 99.57 for a wide threshold interval (62.08-75.04%). An Equal Error Rate of 0.64 % can be granted. The proposed metric framework is shown to behave more fairly than classical likelihood ratios in supporting the hypothesis of the defense vs that of the prosecution, thus ofering a more reliable evaluation scoring. Possible applications are Speaker Verification and Dysphonic Voice Grading.