1000 results for Processament de la parla
Abstract:
This special issue aims to cover some problems related to non-linear and nonconventional speech processing. The origin of this volume is the ISCA Tutorial and Research Workshop on Non-Linear Speech Processing, NOLISP’09, held at the Universitat de Vic (Catalonia, Spain) on June 25–27, 2009. The series of NOLISP workshops, which started in 2003, has become a biennial event whose aim is to discuss alternative techniques for speech processing that, in a sense, do not fit into mainstream approaches. A selection of papers based on the presentations delivered at NOLISP’09 has given rise to this issue of Cognitive Computation.
Abstract:
The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer disease (AD) detection, and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying intelligent algorithms to speech features obtained from suspected patients in order to improve the diagnosis of AD and of its degree of severity. To this end, Artificial Neural Networks (ANN) have been used for the automatic classification of the two classes (AD and control subjects). Two human aspects have been analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, have been explored. The approach is non-invasive, low cost and free of side effects. The experimental results obtained were very satisfactory and promising for early diagnosis and classification of AD patients.
Abstract:
Alzheimer's disease is the most prevalent form of progressive degenerative dementia; it has a high socio-economic impact in Western countries. Therefore it is one of the most active research areas today. Alzheimer's is sometimes diagnosed by excluding other dementias, and definitive confirmation is only obtained through a post-mortem study of the brain tissue of the patient. The work presented here is part of a larger study that aims to identify novel technologies and biomarkers for early Alzheimer's disease detection, and it focuses on evaluating the suitability of a new approach for early diagnosis of Alzheimer’s disease by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying Machine Learning algorithms to speech features obtained from suspected Alzheimer sufferers in order to help diagnose this disease and determine its degree of severity. Two human capabilities relevant in communication have been analyzed for feature selection: Spontaneous Speech and Emotional Response. The experimental results obtained were very satisfactory and promising for the early diagnosis and classification of Alzheimer’s disease patients.
Abstract:
Prediction filters are well-known models for signal estimation in communications, control and many other areas. The classical method for deriving linear prediction coding (LPC) filters is often based on the minimization of a mean square error (MSE). Consequently, only second-order statistics are required, but the estimation is optimal only if the residue is independent and identically distributed (iid) Gaussian. In this paper, we derive the maximum likelihood (ML) estimate of the prediction filter. Relationships with robust estimation of auto-regressive (AR) processes, with blind deconvolution and with source separation based on mutual information minimization are then detailed. The algorithm, based on the minimization of a high-order statistics criterion, uses on-line estimation of the residue statistics. Experimental results highlight the interest of this approach.
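As a point of reference for the classical MSE-based method mentioned in this abstract, the following is a minimal sketch of linear prediction by the autocorrelation method, solved with the Levinson-Durbin recursion. It is not the ML algorithm derived in the paper; the function names `autocorr` and `levinson_durbin` and the test signal are illustrative assumptions.

```python
def autocorr(x, max_lag):
    """Biased (unnormalized) autocorrelation r[0..max_lag] of a sequence."""
    n = len(x)
    return [sum(x[t] * x[t - k] for t in range(k, n)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the MSE normal equations for a linear predictor.

    Returns (a, err) where the prediction is
    x_hat[t] = sum(a[k] * x[t - 1 - k] for k in range(order))
    and err is the residual energy. Assumes r[0] > 0 and a signal that is
    not perfectly predictable (err must stay non-zero during the recursion).
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Reflection coefficient for model order i + 1.
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Hypothetical test signal: a decaying exponential, i.e. x[t] = 0.9 * x[t-1].
signal = [0.9 ** t for t in range(200)]
coeffs, residual_energy = levinson_durbin(autocorr(signal, 2), 2)
```

Because this criterion uses only the autocorrelation sequence, it relies on second-order statistics alone, which is the limitation the abstract points out for non-Gaussian residues.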
Abstract:
The linear prediction coding of speech is based on the assumption that the generation model is autoregressive. In this paper we propose a structure to cope with the nonlinear effects present in the generation of the speech signal. This structure consists of two stages: the first is a classical linear prediction filter, and the second models the residual signal by means of two nonlinearities with a linear filter between them. The coefficients of this filter are computed by means of a gradient search on the score function. This is done in order to deal with the fact that the probability distribution of the residual signal is still not Gaussian, which is taken into account when the coefficients are computed by an ML estimate. The algorithm, based on the minimization of a high-order statistics criterion, uses on-line estimation of the residue statistics and is based on blind deconvolution of Wiener systems [1]. Improvements in the experimental results with speech signals highlight the interest of this approach.
Abstract:
Alzheimer’s disease (AD) is the most prevalent form of progressive degenerative dementia and it has a high socio-economic impact in Western countries; it is therefore one of the most active research areas today. Its diagnosis is sometimes made by excluding other dementias, and definitive confirmation must be obtained through a post-mortem study of the brain tissue of the patient. The purpose of this paper is to contribute to improving the early diagnosis of AD and the assessment of its degree of severity, by means of an automatic analysis performed by non-invasive intelligent methods. The methods selected in this case are Automatic Spontaneous Speech Analysis (ASSA) and Emotional Temperature (ET), which have the great advantage of being non-invasive, low cost and free of side effects.
Abstract:
This paper analyzes applications of cumulant analysis in speech processing. Special focus is placed on different second-order statistics. A dominant role is played by an integral representation for cumulants by means of integrals involving cyclic products of kernels.
Abstract:
In this paper we explore the use of non-linear transformations in order to improve the performance of an entropy-based voice activity detector (VAD). The idea of using a non-linear transformation comes from previous work in the field of speech linear prediction (LPC) based on source separation techniques, where the score function was added to the classical equations in order to take into account the real distribution of the signal. We explore the possibility of estimating the entropy of frames after calculating their score function, instead of using the original frames. We observe that if the signal is clean, the estimated entropy is essentially the same; but if the signal is noisy, the transformed frames (with the score function) yield different entropy values for voiced and unvoiced frames. Experimental results show that this makes it possible to detect voice activity under high noise, where the simple entropy method fails.
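The baseline idea of an entropy-based detector can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which entropy estimate is used, so the sketch assumes spectral entropy of each frame, and it omits the score-function transformation the paper proposes. The names `power_spectrum`, `spectral_entropy`, `is_speech` and the `threshold` parameter are hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of the DFT bins 0..n/2 of a real-valued frame (direct O(n^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(s) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalized power spectrum of a frame."""
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total == 0.0:
        # Silent frame: treat it as maximally flat so it is not flagged as speech.
        return math.log(len(spectrum))
    probs = [p / total for p in spectrum]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def is_speech(frame, threshold):
    """Flag a frame as speech when its spectrum is peaky (low entropy).

    The threshold is an assumption and would have to be tuned on real data.
    """
    return spectral_entropy(frame) < threshold


# A pure tone concentrates its energy in one bin, so its entropy is close to 0;
# an impulse has a flat spectrum, so its entropy is the maximum, log(n/2 + 1).
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
```

With strong additive noise the spectrum of every frame flattens, which is the regime where the abstract reports that the plain entropy method fails.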
Abstract:
The purpose of our project is to contribute to earlier diagnosis of AD and better estimates of its severity by using automatic analysis performed through new biomarkers extracted by non-invasive intelligent methods. The methods selected in this case are speech biomarkers oriented to Spontaneous Speech and Emotional Response Analysis. Thus the main goal of the present work is feature search in Spontaneous Speech, oriented to pre-clinical evaluation, for the definition of a test for AD diagnosis by a one-class classifier. The one-class classification problem differs from the multi-class one in one essential aspect: in one-class classification it is assumed that information is available for only one of the classes, the target class. In this work we explore the problem of imbalanced datasets, which is particularly crucial in applications where the goal is to maximize recognition of the minority class, as in medical diagnosis. The use of information about outliers and Fractal Dimension features improves the system performance.
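To make the one-class setting concrete, here is a minimal sketch in which a model is fitted on target-class samples only and new samples are rejected when they lie too far from that class. The abstract does not name the classifier used, so the standardized-distance rule, the `quantile` parameter and the function names `fit_one_class` and `accepts` are all assumptions chosen for illustration.

```python
import math


def fit_one_class(targets, quantile=0.95):
    """Fit a simple one-class model using target-class feature vectors only."""
    n = len(targets)
    dim = len(targets[0])
    mean = [sum(x[d] for x in targets) / n for d in range(dim)]
    # Fall back to 1.0 when a feature is constant, to avoid division by zero.
    std = [
        math.sqrt(sum((x[d] - mean[d]) ** 2 for x in targets) / n) or 1.0
        for d in range(dim)
    ]
    model = {"mean": mean, "std": std, "threshold": 0.0}
    # Threshold: the chosen quantile of the training distances.
    distances = sorted(distance(model, x) for x in targets)
    model["threshold"] = distances[min(n - 1, int(quantile * n))]
    return model


def distance(model, x):
    """Euclidean distance to the target mean after per-feature standardization."""
    return math.sqrt(
        sum(((v - m) / s) ** 2 for v, m, s in zip(x, model["mean"], model["std"]))
    )


def accepts(model, x):
    """True if x is attributed to the target class, False if it is an outlier."""
    return distance(model, x) <= model["threshold"]


# Hypothetical two-dimensional features clustered around the origin.
training = [(0.1 * i, 0.1 * j) for i in range(-2, 3) for j in range(-2, 3)]
model = fit_one_class(training)
```

No outlier examples are needed to fit this model, which is the property that distinguishes the one-class setting from multi-class training.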
Abstract:
In this paper we present experimental results comparing on-line drawings from a control population (left and right hand) and from Alzheimer disease patients. The drawings have been acquired by means of a digitizing tablet, which records time information, angles and pressures. Experimental measures based on pressure and in-air movements appear to be significantly different between the two groups, even when the control population performs the tasks with the non-dominant hand.
Abstract:
Christmas is a time full of symbolism, in which conversations with family and friends take centre stage. There is a biblical phrase, "And the Word became man and lived among us" (John 1:1-14), that always brings this to mind, since in a religious context the term "verb" (the Word) can mean precisely that: "expression of ideas and thoughts by means of words" (DIEC2). More than 6,800 languages are currently spoken on our planet. There are tonal languages, such as Mandarin and Yoruba, in which the tone with which a word is pronounced affects its meaning [...].
Abstract:
Within the framework of the European project HERMES, the Centre de Visió per Computador of the Universitat Autònoma de Barcelona is developing a computer-animated conversational agent that must be able to interact with the user through several channels simultaneously, that is, to speak, gesture, express emotions... Starting from software capable of making a 3D model of a human head express emotions and speak in English, given a previously generated audio file, the work presented here searches for a text-to-speech synthesis tool that makes it possible to do the same in Catalan. This document explains the process followed to find this tool, the research carried out into how both tools work in order to understand them and be able to work with them, and, finally, the modifications made so that they can interact and generate intelligible speech in Catalan from texts written in that language.
Abstract:
With the growing computing power of processing nodes, more and more data-intensive applications, such as bioinformatics applications, will be run on non-dedicated clusters. Non-dedicated clusters are characterized by their ability to combine the execution of local users' applications with scientific or commercial applications executed in parallel. Knowing what effect data-intensive applications have when mixed with other types of workload (batch, interactive, SRT, etc.) in non-dedicated environments enables the development of more efficient scheduling policies. Some I/O-intensive applications are based on the MapReduce paradigm, where the environments that run them, such as Hadoop, handle data locality and load balancing automatically and work with distributed file systems. Hadoop's performance can be improved without increasing hardware costs by tuning several key configuration parameters to the cluster specifications, the size of the input data and the complexity of the processing. Tuning these parameters may be too complex for the user and/or administrator, but it aims to ensure more adequate performance. This work proposes evaluating the impact of I/O-intensive applications on job scheduling in non-dedicated clusters under the MPI and MapReduce paradigms.
Abstract:
To work with laryngectomized patients to help them recover the faculty of speech after total laryngectomy and thus normalize, and try to overcome, the great bio-psycho-social change that this surgical intervention entails. To analyze and compare the voice quality results obtained with the different voice rehabilitation methods: fistuloplasty (mainly) and esophageal speech (erygmophonia); and to obtain objective data on factors of major relevance in the oncology patient.
Abstract:
The main objective of this project is the characterization of the La Jabonera micro-watershed (Estelí, Nicaragua), emphasizing water as the key factor that connects all the elements interacting in the micro-watershed and that also delimits the study area. The fieldwork consisted basically of georeferencing the points of interest, surveying the population and assessing the water sources and the river water through physicochemical analyses. In processing the information, thematic maps were produced with GIS tools, which supported the interpretation of the results. The morphometric and biophysical characteristics favour the rapid loss of precipitated water through surface runoff, with a moderate tendency towards flash floods and flooding. Infiltrated water circulates rapidly through fractures in the geological material with short transit times; moreover, the recharge area of the springs is local, so the sources are especially vulnerable to periods of drought and to contamination in their immediate surroundings. The study of land use, together with the water analyses, made it possible to determine that agrochemicals are the main potential source of water contamination in the micro-watershed. The results obtained show the need for integrated land management that guarantees sustainable socio-environmental development.