985 resultados para Perceptual speech analysis


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer disease (AD) detection and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine in a pilot study the potential of applying intelligent algorithms to speech features obtained from suspected patients in order to contribute to the improvement of diagnosis of AD and its degree of severity. In this sense, Artificial Neural Networks (ANN) have been used for the automatic classification of the two classes (AD and control subjects). Two human issues have been analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, have been explored. The approach is non invasive, low cost and without any side effects. Obtained experimental results were very satisfactory and promising for early diagnosis and classification of AD patients.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Alzheimer's disease is the most prevalent form of progressive degenerative dementia; it has a high socio-economic impact in Western countries. Therefore it is one of the most active research areas today. Alzheimer's is sometimes diagnosed by excluding other dementias, and definitive confirmation is only obtained through a post-mortem study of the brain tissue of the patient. The work presented here is part of a larger study that aims to identify novel technologies and biomarkers for early Alzheimer's disease detection, and it focuses on evaluating the suitability of a new approach for early diagnosis of Alzheimer’s disease by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying Machine Learning algorithms to speech features obtained from suspected Alzheimer sufferers in order help diagnose this disease and determine its degree of severity. Two human capabilities relevant in communication have been analyzed for feature selection: Spontaneous Speech and Emotional Response. The experimental results obtained were very satisfactory and promising for the early diagnosis and classification of Alzheimer’s disease patients.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Alzheimer’s disease (AD) is the most prevalent form of progressive degenerative dementia and it has a high socio-economic impact in Western countries, therefore is one of the most active research areas today. Its diagnosis is sometimes made by excluding other dementias, and definitive confirmation must be done trough a post-mortem study of the brain tissue of the patient. The purpose of this paper is to contribute to im-provement of early diagnosis of AD and its degree of severity, from an automatic analysis performed by non-invasive intelligent methods. The methods selected in this case are Automatic Spontaneous Speech Analysis (ASSA) and Emotional Temperature (ET), that have the great advantage of being non invasive, low cost and without any side effects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Alzheimer’s disease (AD) is the most prevalent form of progressive degenerative dementia and it has a high socio-economic impact in Western countries, therefore is one of the most active research areas today. Its diagnosis is sometimes made by excluding other dementias, and definitive confirmation must be done trough a post-mortem study of the brain tissue of the patient. The purpose of this paper is to contribute to improvement of early diagnosis of AD and its degree of severity, from an automatic analysis performed by non-invasive intelligent methods. The methods selected in this case are Automatic Spontaneous Speech Analysis (ASSA) and Emotional Temperature (ET), that have the great advantage of being non invasive, low cost and without any side effects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Medical fields requires fast, simple and noninvasive methods of diagnostic techniques. Several methods are available and possible because of the growth of technology that provides the necessary means of collecting and processing signals. The present thesis details the work done in the field of voice signals. New methods of analysis have been developed to understand the complexity of voice signals, such as nonlinear dynamics aiming at the exploration of voice signals dynamic nature. The purpose of this thesis is to characterize complexities of pathological voice from healthy signals and to differentiate stuttering signals from healthy signals. Efficiency of various acoustic as well as non linear time series methods are analysed. Three groups of samples are used, one from healthy individuals, subjects with vocal pathologies and stuttering subjects. Individual vowels/ and a continuous speech data for the utterance of the sentence "iruvarum changatimaranu" the meaning in English is "Both are good friends" from Malayalam language are recorded using a microphone . The recorded audio are converted to digital signals and are subjected to analysis.Acoustic perturbation methods like fundamental frequency (FO), jitter, shimmer, Zero Crossing Rate(ZCR) were carried out and non linear measures like maximum lyapunov exponent(Lamda max), correlation dimension (D2), Kolmogorov exponent(K2), and a new measure of entropy viz., Permutation entropy (PE) are evaluated for all three groups of the subjects. Permutation Entropy is a nonlinear complexity measure which can efficiently distinguish regular and complex nature of any signal and extract information about the change in dynamics of the process by indicating sudden change in its value. The results shows that nonlinear dynamical methods seem to be a suitable technique for voice signal analysis, due to the chaotic component of the human voice. Permutation entropy is well suited due to its sensitivity to uncertainties, since the pathologies are characterized by an increase in the signal complexity and unpredictability. Pathological groups have higher entropy values compared to the normal group. The stuttering signals have lower entropy values compared to the normal signals.PE is effective in charaterising the level of improvement after two weeks of speech therapy in the case of stuttering subjects. PE is also effective in characterizing the dynamical difference between healthy and pathological subjects. This suggests that PE can improve and complement the recent voice analysis methods available for clinicians. The work establishes the application of the simple, inexpensive and fast algorithm of PE for diagnosis in vocal disorders and stuttering subjects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Although in Europe and in the USA many studies focus on organic, little is known on the topic in China. This research provides an insight on Shanghai consumers’ perception of organic, aiming at understanding and representing in graphic form the network of mental associations that stems from the organic concept. To acquire, process and aggregate the individual networks it was used the “Brand concept mapping” methodology (Roedder et al., 2006), while the data analysis was carried out also using analytic procedures. The results achieved suggest that organic food is perceived as healthy, safe and costly. Although these attributes are pretty much consistent with the European perception, some relevant differences emerged. First, organic is not necessarily synonymous with natural product in China, also due to a poor translation of the term in the Chinese language that conveys the idea of a manufactured product. Secondly, the organic label has to deal with the competition with the green food label in terms of image and positioning on the market, since they are easily associated and often confused. “Environmental protection” also emerged as relevant association, while the ethical and social values were not mentioned. In conclusion, health care and security concerns are the factors that influence most the food consumption in China (many people are so concerned about food safety that they found it difficult to shop), and the associations “Safe”, “Pure and natural”, “without chemicals” and “healthy” have been identified as the best candidates for leveraging a sound image of organic food .

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis describes work undertaken in order to fulfil a need experienced in the Department of Educational Enquiry at the University of Aston in Birmingham for speech analysis facilities suitable for use in teaching and research work within the Department. The hardware and software developed during the research project provides displays of speech fundamental frequency and intensity in real time. The system is suitable for the provision of visual feedback of these parameters of a subject's speech in a learning situation, and overcomes the inadequacies of equipment currently used for this task in that it provides a clear indication of fundamental frequency contours as the subject is speaking. The thesis considers the use of such equipment in several related fields, and the approaches that have been reported to one of the major problems of speech analysis, namely pitch-period estimation. A number of different systems are described, and their suitability for the present purposes is discussed. Finally, a novel method of pitch-period estimation is developed, and a speech analysis system incorporating this method is described. Comparison is made between the results produced by this system and those produced by a conventional speech spectrograph.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The standard reference clinical score quantifying average Parkinson's disease (PD) symptom severity is the Unified Parkinson's Disease Rating Scale (UPDRS). At present, UPDRS is determined by the subjective clinical evaluation of the patient's ability to adequately cope with a range of tasks. In this study, we extend recent findings that UPDRS can be objectively assessed to clinically useful accuracy using simple, self-administered speech tests, without requiring the patient's physical presence in the clinic. We apply a wide range of known speech signal processing algorithms to a large database (approx. 6000 recordings from 42 PD patients, recruited to a six-month, multi-centre trial) and propose a number of novel, nonlinear signal processing algorithms which reveal pathological characteristics in PD more accurately than existing approaches. Robust feature selection algorithms select the optimal subset of these algorithms, which is fed into non-parametric regression and classification algorithms, mapping the signal processing algorithm outputs to UPDRS. We demonstrate rapid, accurate replication of the UPDRS assessment with clinically useful accuracy (about 2 UPDRS points difference from the clinicians' estimates, p < 0.001). This study supports the viability of frequent, remote, cost-effective, objective, accurate UPDRS telemonitoring based on self-administered speech tests. This technology could facilitate large-scale clinical trials into novel PD treatments.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the last few years, the number of systems and devices that use voice based interaction has grown significantly. For a continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However there are currently very few studies that try to evaluate how pleasant is a voice from a perceptual point of view when the final application is a speech based interface. In this paper we present an objective definition for voice pleasantness based on the composition of a representative feature subset and a new automatic voice pleasantness classification and intensity estimation system. Our study is based on a database composed by European Portuguese female voices but the methodology can be extended to male voices or to other languages. In the objective performance evaluation the system achieved a 9.1% error rate for voice pleasantness classification and a 15.7% error rate for voice pleasantness intensity estimation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Perceptual voice analysis is a subjective process. However, despite reports of varying degrees of intrajudge and interjudge reliability, it is widely used in clinical voice evaluation. One of the ways to improve the reliability of this procedure is to provide judges with signals as external standards so that comparison can be made in relation to these anchor signals. The present study used a Klatt speech synthesizer to create a set of speech signals with varying degree of three different voice qualities based on a Cantonese sentence. The primary objective of the study was to determine whether different abnormal voice qualities could be synthesized using the built-in synthesis parameters using a perceptual study. The second objective was to determine the relationship between acoustic characteristics of the synthesized signals and perceptual judgment. Twenty Cantonese-speaking speech pathologists with at least three years of clinical experience in perceptual voice evaluation were asked to undertake two tasks. The first was to decide whether the voice quality of the synthesized signals was normal or not. The second was to decide whether the abnormal signals should be described as rough, breathy, or vocal fry. The results showed that signals generated with a small degree of aspiration noise were perceived as breathiness while signals with a small degree of flutter or double pulsing were perceived as roughness. When the flutter or double pulsing increased further, tremor and vocal fry, rather than roughness, were perceived. Furthermore, the amount of aspiration noise, flutter, or double pulsing required for male voice stimuli was different from that required for the female voice stimuli with a similar level of perceptual breathiness and roughness. These findings showed that changes in perceived vocal quality could be achieved by systematic modifications of synthesis parameters. This opens up the possibility of using synthesized voice signals as external standards or anchors to improve the reliability of clinical perceptual voice evaluation. (C) 2002 Acoustical Society of America.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The speech characteristics, oromotor function and speech intelligibility of a group of children treated for cerebellar tumour (CT) was investigated perceptually. Assessment of these areas was performed on 11 children treated for CT with dysarthric speech as well as 21 non-neurologically impaired controls matched for age and sex to obtain a comprehensive perceptual profile of their speech and oromotor mechanism. Contributing to the perception of dysarthria were a number of deviant speech dimensions including imprecision of consonants, hoarseness and decreased pitch variation, as well as a reduction in overall speech intelligibility for both sentences and connected speech. Oromotor assessment revealed deficits in lip, tongue and laryngeal function, particularly relating to deficits in timing and coordination of movements. The most salient features of the dysarthria seen in children treated for CT were the mild nature of the speech disorder and clustering of speech deficits in the prosodic, phonatory and articulatory aspects of speech production.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The canonical representation of speech constitutes a perfect reconstruction (PR) analysis-synthesis system. Its parameters are the autoregressive (AR) model coefficients, the pitch period and the voiced and unvoiced components of the excitation represented as transform coefficients. Each set of parameters may be operated on independently. A time-frequency unvoiced excitation (TFUNEX) model is proposed that has high time resolution and selective frequency resolution. Improved time-frequency fit is obtained by using for antialiasing cancellation the clustering of pitch-synchronous transform tracks defined in the modulation transform domain. The TFUNEX model delivers high-quality speech while compressing the unvoiced excitation representation about 13 times over its raw transform coefficient representation for wideband speech.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The metallic voice is usually confused with ring or nasality by singers and nontrained listeners. who are not used to perceptual vocal analysis. They believe a metallic voice results from a rise in fundamental frequency. A diagnostic error in this aspect may lead to lowering pitch, an incorrect procedure that Could Cause vocal overload and fatigue. The purpose of this article is to Study the quality of metallic voice considering the correlation between information of the physiological and acoustic plans, based on a perceptive consensual assumption. Fiberscopic video pharyngolaryngoscopy was performed on 21 professional singers while speaking vowel [e]-in normal and metallic modes to observe muscular movements and structural changes of the velopharynx, pharynx, and larynx. Vocal samples captured simultaneously to the fiberscopic examination were acoustically analyzed. Frequency and amplitude of the first four formants (F(1), F(2), F(3), and F(4)) were extracted by means of linear predictor coefficients (LPC) Spectrum and were statistically analyzed. Vocal tract adjustments such as velar lowering, pharyngeal wall narrowing, laryngeal rise, aryepiglottic, and lateral laryngeal constrictions were frequently found: there were no significant changes in frequency and amplitude of F(1) in the metallic voiced there were significant increases in amplitudes of F(2), F(3), and F(4) and in frequency for F, metallic Voice perceived as louder was correlated to an increase ill amplitude of F(3) and F(4). Physiological adjustments of velopharynx, pharynx, and larynx are combined in characterizing the metallic voice and can be acoustically related to changes in formant pattern.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Objective: To evaluate the maximum residual signal auto-correlation also known as pitch amplitude (PA) values in patients with Parkinson's disease (PD) patients. Method. The signals of 21 Parkinson's patients were compared with 15 healthy individuals, divided according age and gender. Results: Statistical difference was seen between groups for PA, 0.39 for controls and 0.25 for PD. Normal value threshold was set as 0.3; (p <= 0.001). In the Parkinson's group 80.77%, and in the control group only 12.28%, had a PA < 0.3 demonstrating an association between these variables. The dispersion diagram for age and PA for PD individuals showed p=0.01 and r=0.54. There was no significant difference in relation to gender and PA between groups: Conclusion: the significant differences in pitch's amplitude between PD patients and healthy individuals demonstrate the methods specificity.-The results showed the need of prospective controlled studies,to improve the use and indications of residual signal auto-correlation to evaluate speech in PD patients.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

BACKGROUND: One of the great difficulties in evaluating a voice is the judgment of quality through the perceptual auditive analysis--although frequently used--, as it is influenced by socioeconomic and cultural aspects as well as individual preferences. Many are the adjectives and methods used in this assessment, especially because of the subjectivity involved in the process, leading to incompatibilities between listeners and difficulties in reaching a consensus on the use of this or that terminology. In such a context, the voice laboratory and more specifically the acoustic computerized analysis, has guided and complemented speech-language treatments. Among the several possibilities of spectrographic analysis, the (Long-Term Average Spectrum--LTAS) quantifies the quality of voices, pointing differences between gender, age, professional--spoken and sang--and dysphonic voices. The LTAS has been used a lot in researches that investigate voice. As it evidences the contribution of the glottic source and of resonance to the quality of voice, it provides objective parameters for the evaluation of this aspect which usually depends on our auditive perception. AIM: to demonstrate how LTAS can be applied in voice research and in the speech-language therapy practice, describing both the technical aspects required for the production and interpretation of results, and its limitations. CONCLUSION: The area of voice research has developed a lot in these last two decades especially because of the advent of the voice and speech laboratory. For this reason, the knowledge about the applicability of more tools for voice analysis, as the LTAS, as well as the existing need for more studies in this area, will most certainly contribute for the creation of new research areas not only in the field of professional voice but also in the field of therapy.