6 resultados para Voice quality
em Universidad Politécnica de Madrid
Resumo:
A case study of vocal fold paralysis treatment is described with the help of the voice quality analysis application BioMet®Phon. The case corresponds to a description of a 40 - year old female patient who was diagnosed of vocal fold paralysis following a cardio - pulmonar intervention which required intubation for 8 days and posterior tracheotomy for 15 days. The patient presented breathy and asthenic phon ation, and dysphagia. Six main examinations were conducted during a full year period that the treatment lasted consisting in periodic reviews including video - endostroboscopy, voice analysis and breathing function monitoring. The phoniatrician treatment inc luded 20 sessions of vocal rehabilitation, followed by an intracordal infiltration with Radiesse 8 months after the rehabilitation treatment started followed by 6 sessions of rehabilitation more. The videondoscopy and the voicing quality analysis refer a s ubstantial improvement in the vocal function with recovery in all the measures estimated (jitter, shimmer, mucosal wave contents, glottal closure, harmonic contents and biomechanical function analysis). The paper refers the procedure followed and the results obtained by comparing the longitudinal progression of the treatment, illustrating the utility of voice quality analysis tools in speech therapy.
Resumo:
BioMet®Phon is a software application developed for the characterization of voice in voice quality evaluation. Initially it was conceived as plain research code to estimate the glottal source from voice and obtain the biomechanical parameters of the vocal folds from the spectral density of the estimate. This code grew to what is now the Glottex®Engine package (G®E). Further demands from users in laryngology and speech therapy fields instantiated the development of a specific Graphic User Interface (GUI’s) to encapsulate user interaction with the G®E. This gave place to BioMet®Phon, an application which extracts the glottal source from voice and offers a complete parameterization of this signal, including distortion, cepstral, spectral, biomechanical, time domain, contact and tremor parameters. The semantic capabilities of biomechanical parameters are discussed. Study cases from its application to the field of laryngology and speech therapy are given and discussed. Validation results in voice pathology detection are also presented. Applications to laryngology, speech therapy, and monitoring neurological deterioration in the elder are proposed.
Resumo:
BioMet®Tools is a set of software applications developed for the biometrical characterization of voice in different fields as voice quality evaluation in laryngology, speech therapy and rehabilitation, education of the singing voice, forensic voice analysis in court, emotional detection in voice, secure access to facilities and services, etc. Initially it was conceived as plain research code to estimate the glottal source from voice and obtain the biomechanical parameters of the vocal folds from the spectral density of the estimate. This code grew to what is now the Glottex®Engine package (G®E). Further demands from users in medical and forensic fields instantiated the development of different Graphic User Interfaces (GUI’s) to encapsulate user interaction with the G®E. This required the personalized design of different GUI’s handling the same G®E. In this way development costs and time could be saved. The development model is described in detail leading to commercial production and distribution. Study cases from its application to the field of laryngology and speech therapy are given and discussed.
Resumo:
Las patologías de la voz se han transformado en los últimos tiempos en una problemática social con cierto calado. La contaminación de las ciudades, hábitos como el de fumar, el uso de aparatos de aire acondicionado, etcétera, contribuyen a ello. Esto alcanza más relevancia en profesionales que utilizan su voz de manera frecuente, como, por ejemplo, locutores, cantantes, profesores o teleoperadores. Por todo ello resultan de especial interés las técnicas de ayuda al diagnóstico que son capaces de extraer conclusiones clínicas a partir de una muestra de la voz grabada con un micrófono, frente a otras invasivas que implican la exploración utilizando laringoscopios, fibroscopios o videoendoscopios, técnicas en cualquier caso mucho más molestas para los pacientes al exigir la introducción parcial del instrumental citado por la garganta, en actuaciones consideradas de tipo quirúrgico. Dentro de aquellas técnicas se ha avanzado mucho en un período de tiempo relativamente corto. En lo que se refiere al diagnóstico de patologías, hemos pasado en los últimos quince años de trabajar principalmente con parámetros extraídos de la señal de voz –tanto en el dominio del tiempo como en el de la frecuencia– y con escalas elaboradas con valoraciones subjetivas realizadas por expertos a hacerlo también con parámetros procedentes de estimaciones de la fuente glótica. La importancia de utilizar la fuente glótica reside, a grandes rasgos, en que se trata de una señal vinculada directamente al estado de la estructura laríngea del locutor y también en que está generalmente menos influida por el tracto vocal que la señal de voz. Es conocido que el tracto vocal guarda más relación con el mensaje hablado, y su presencia dificulta el proceso de detección de patología vocal. Estas estimaciones de la fuente glótica han sido obtenidas a través de técnicas de filtrado inverso desarrolladas por nuestro grupo de investigación. Hemos conseguido, además, profundizar en la naturaleza de la señal glótica: somos capaces de descomponerla y relacionarla con parámetros biomecánicos de los propios pliegues vocales, obteniendo estimaciones de elementos como la masa, la pérdida de energía o la elasticidad del cuerpo y de la cubierta del pliegue, entre otros. De las componentes de la fuente glótica surgen también los denominados parámetros biométricos, relacionados con la forma de la señal, que constituyen por sí mismos una firma biométrica del individuo. También trabajaremos con parámetros temporales, relacionados con las diferentes etapas que se observan dentro de la señal glótica durante un ciclo de fonación. Por último, consideraremos parámetros clásicos de perturbación y energía de la señal. En definitiva, contamos ahora con una considerable cantidad de parámetros glóticos que conforman una base estadística multidimensional, destinada a ser capaz de discriminar personas con voces patológicas o disfónicas de aquellas que no presentan patología en la voz o con voces sanas o normofónicas. Esta tesis doctoral se ocupa de varias cuestiones: en primer lugar, es necesario analizar cuidadosamente estos nuevos parámetros, por lo que ofreceremos una completa descripción estadística de los mismos. También estudiaremos cuestiones como la distribución de los parámetros atendiendo a criterios como el de normalidad estadística de los mismos, ocupándonos especialmente de la diferencia entre las distribuciones que presentan sujetos sanos y sujetos con patología vocal. Para todo ello emplearemos diferentes técnicas estadísticas: generación de elementos y diagramas descriptivos, pruebas de normalidad y diversos contrastes de hipótesis, tanto paramétricos como no paramétricos, que considerarán la diferencia entre los grupos de personas sanas y los grupos de personas con alguna patología relacionada con la voz. Además, nos interesa encontrar relaciones estadísticas entre los parámetros, de cara a eliminar posibles redundancias presentes en el modelo, a reducir la dimensionalidad del problema y a establecer un criterio de importancia relativa en los parámetros en cuanto a su capacidad discriminante para el criterio patológico/sano. Para ello se aplicarán técnicas estadísticas como la Correlación Lineal Bivariada y el Análisis Factorial basado en Componentes Principales. Por último, utilizaremos la conocida técnica de clasificación Análisis Discriminante, aplicada a diferentes combinaciones de parámetros y de factores, para determinar cuáles de ellas son las que ofrecen tasas de acierto más prometedoras. Para llevar a cabo la experimentación se ha utilizado una base de datos equilibrada y robusta formada por doscientos sujetos, cien de ellos pertenecientes al género femenino y los restantes cien al género masculino, con una proporción también equilibrada entre los sujetos que presentan patología vocal y aquellos que no la presentan. Una de las aplicaciones informáticas diseñada para llevar a cabo la recogida de muestras también es presentada en esta tesis. Los distintos estudios estadísticos realizados nos permitirán identificar aquellos parámetros que tienen una mayor contribución a la hora de detectar la presencia de patología vocal. Alguno de los estudios, además, nos permitirá presentar una ordenación de los parámetros en base a su importancia para realizar la detección. Por otra parte, también concluiremos que en ocasiones es conveniente realizar una reducción de la dimensionalidad de los parámetros para mejorar las tasas de detección. Por fin, las propias tasas de detección constituyen quizá la conclusión más importante del trabajo. Todos los análisis presentes en el trabajo serán realizados para cada uno de los dos géneros, de acuerdo con diversos estudios previos que demuestran que los géneros masculino y femenino deben tratarse de forma independiente debido a las diferencias orgánicas observadas entre ambos. Sin embargo, en lo referente a la detección de patología vocal contemplaremos también la posibilidad de trabajar con la base de datos unificada, comprobando que las tasas de acierto son también elevadas. Abstract Voice pathologies have become recently in a social problem that has reached a certain concern. Pollution in cities, smoking habits, air conditioning, etc. contributes to it. This problem is more relevant for professionals who use their voice frequently: speakers, singers, teachers, actors, telemarketers, etc. Therefore techniques that are capable of drawing conclusions from a sample of the recorded voice are of particular interest for the diagnosis as opposed to other invasive ones, involving exploration by laryngoscopes, fiber scopes or video endoscopes, which are techniques much less comfortable for patients. Voice quality analysis has come a long way in a relatively short period of time. In regard to the diagnosis of diseases, we have gone in the last fifteen years from working primarily with parameters extracted from the voice signal (both in time and frequency domains) and with scales drawn from subjective assessments by experts to produce more accurate evaluations with estimates derived from the glottal source. The importance of using the glottal source resides broadly in that this signal is linked to the state of the speaker's laryngeal structure. Unlike the voice signal (phonated speech) the glottal source, if conveniently reconstructed using adaptive lattices, may be less influenced by the vocal tract. As it is well known the vocal tract is related to the articulation of the spoken message and its influence complicates the process of voice pathology detection, unlike when using the reconstructed glottal source, where vocal tract influence has been almost completely removed. The estimates of the glottal source have been obtained through inverse filtering techniques developed by our research group. We have also deepened into the nature of the glottal signal, dissecting it and relating it to the biomechanical parameters of the vocal folds, obtaining several estimates of items such as mass, loss or elasticity of cover and body of the vocal fold, among others. From the components of the glottal source also arise the so-called biometric parameters, related to the shape of the signal, which are themselves a biometric signature of the individual. We will also work with temporal parameters related to the different stages that are observed in the glottal signal during a cycle of phonation. Finally, we will take into consideration classical perturbation and energy parameters. In short, we have now a considerable amount of glottal parameters in a multidimensional statistical basis, designed to be able to discriminate people with pathologic or dysphonic voices from those who do not show pathology. This thesis addresses several issues: first, a careful analysis of these new parameters is required, so we will offer a complete statistical description of them. We will also discuss issues such as distribution of the parameters, considering criteria such as their statistical normality. We will take special care in the analysis of the difference between distributions from healthy subjects and the distributions from pathological subjects. To reach these goals we will use different statistical techniques such as: generation of descriptive items and diagramas, tests for normality and hypothesis testing, both parametric and nonparametric. These latter techniques consider the difference between the groups of healthy subjects and groups of people with an illness related to voice. In addition, we are interested in finding statistical relationships between parameters. There are various reasons behind that: eliminate possible redundancies in the model, reduce the dimensionality of the problem and establish a criterion of relative importance in the parameters. The latter reason will be done in terms of discriminatory power for the criterion pathological/healthy. To this end, statistical techniques such as Bivariate Linear Correlation and Factor Analysis based on Principal Components will be applied. Finally, we will use the well-known technique of Discriminant Analysis classification applied to different combinations of parameters and factors to determine which of these combinations offers more promising success rates. To perform the experiments we have used a balanced and robust database, consisting of two hundred speakers, one hundred of them males and one hundred females. We have also used a well-balanced proportion where subjects with vocal pathology as well as subjects who don´t have a vocal pathology are equally represented. A computer application designed to carry out the collection of samples is also presented in this thesis. The different statistical analyses performed will allow us to determine which parameters contribute in a more decisive way in the detection of vocal pathology. Therefore, some of the analyses will even allow us to present a ranking of the parameters based on their importance for the detection of vocal pathology. On the other hand, we will also conclude that it is sometimes desirable to perform a dimensionality reduction in order to improve the detection rates. Finally, detection rates themselves are perhaps the most important conclusion of the work. All the analyses presented in this work have been performed for each of the two genders in agreement with previous studies showing that male and female genders should be treated independently, due to the observed functional differences between them. However, with regard to the detection of vocal pathology we will consider the possibility of working with the unified database, ensuring that the success rates obtained are also high.
Resumo:
The dramatic impact of neurological degenerative pathologies in life quality is a growing concern. It is well known that many neurological diseases leave a fingerprint in voice and speech production. Many techniques have been designed for the detection, diagnose and monitoring the neurological disease. Most of them are costly or difficult to extend to primary attention medical services. Through the present paper it will be shown how some neurological diseases can be traced at the level of phonation. The detection procedure would be based on a simple voice test. The availability of advanced tools and methodologies to monitor the organic pathology of voice would facilitate the implantation of these tests. The paper hypothesizes that some of the underlying mechanisms affecting the production of voice produce measurable correlates in vocal fold biomechanics. A general description of the methodological foundations for the voice analysis system which can estimate correlates to the neurological disease is shown. Some study cases will be presented to illustrate the possibilities of the methodology to monitor neurological diseases by voice
Resumo:
Phonation distortion leaves relevant marks in a speaker's biometric profile. Dysphonic voice production may be used for biometrical speaker characterization. In the present paper phonation features derived from the glottal source (GS) parameterization, after vocal tract inversion, is proposed for dysphonic voice characterization in Speaker Verification tasks. The glottal source derived parameters are matched in a forensic evaluation framework defining a distance-based metric specification. The phonation segments used in the study are derived from fillers, long vowels, and other phonation segments produced in spontaneous telephone conversations. Phonated segments from a telephonic database of 100 male Spanish native speakers are combined in a 10-fold cross-validation task to produce the set of quality measurements outlined in the paper. Shimmer, mucosal wave correlate, vocal fold cover biomechanical parameter unbalance and a subset of the GS cepstral profile produce accuracy rates as high as 99.57 for a wide threshold interval (62.08-75.04%). An Equal Error Rate of 0.64 % can be granted. The proposed metric framework is shown to behave more fairly than classical likelihood ratios in supporting the hypothesis of the defense vs that of the prosecution, thus ofering a more reliable evaluation scoring. Possible applications are Speaker Verification and Dysphonic Voice Grading.