936 results for Speech analisys
Abstract:
The last 2 years have seen exciting advances in the genetics of Landau-Kleffner syndrome and related disorders, encompassed within the epilepsy-aphasia spectrum (EAS). The striking finding of mutations in the N-methyl-D-aspartate (NMDA) receptor subunit gene GRIN2A as the first monogenic cause in up to 20% of patients with EAS suggests that excitatory glutamate receptors play a key role in these disorders. Patients with GRIN2A mutations have a recognizable speech and language phenotype that may assist with diagnosis. Other molecules involved in RNA binding and cell adhesion have been implicated in EAS; copy number variations are also found. The emerging picture highlights the overlap of the genetic determinants of EAS with those of speech and language disorders, intellectual disability, autism spectrum disorders and more complex developmental phenotypes.
Abstract:
Alzheimer’s disease (AD) is the most prevalent form of progressive degenerative dementia and has a high socio-economic impact in Western countries; it is therefore one of the most active research areas today. Its diagnosis is sometimes made by excluding other dementias, and definitive confirmation must be obtained through a post-mortem study of the patient's brain tissue. The purpose of this paper is to contribute to improving the early diagnosis of AD and the assessment of its degree of severity through automatic analysis performed by non-invasive intelligent methods. The methods selected in this case are Automatic Spontaneous Speech Analysis (ASSA) and Emotional Temperature (ET), which have the great advantage of being non-invasive, low-cost and free of side effects.
Abstract:
This paper analyzes applications of cumulant analysis in speech processing. Special focus is placed on different second-order statistics. A dominant role is played by an integral representation for cumulants by means of integrals involving cyclic products of kernels.
Abstract:
The purpose of our project is to contribute to earlier diagnosis of AD and better estimates of its severity by using automatic analysis performed through new biomarkers extracted from non-invasive intelligent methods. The methods selected in this case are speech biomarkers oriented to Spontaneous Speech and Emotional Response Analysis. Thus the main goal of the present work is feature search in Spontaneous Speech oriented to pre-clinical evaluation for the definition of a test for AD diagnosis by a one-class classifier. The one-class classification problem differs from multi-class classification in one essential aspect: in one-class classification it is assumed that only information about one of the classes, the target class, is available. In this work we explore the problem of imbalanced datasets, which is particularly crucial in applications where the goal is to maximize recognition of the minority class, as in medical diagnosis. The use of information about outliers and Fractal Dimension features improves the system performance.
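The one-class setting described in this abstract can be illustrated with a minimal sketch. It is not the authors' classifier: it assumes a simple distance-to-centroid model, and the names fit_one_class, is_target and the quantile parameter are hypothetical. The model is fitted on target-class feature vectors only, and anything farther from the target centroid than most training samples is treated as an outlier.

```python
import math


def _distance(x, mean, std):
    """Standardized Euclidean distance of x from the target-class centroid."""
    return math.sqrt(sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, mean, std)))


def fit_one_class(target_samples, quantile=0.95):
    """Fit on target-class samples only; return (mean, std, threshold).

    Assumption: a sample belongs to the target class if its distance to the
    centroid does not exceed the chosen quantile of the training distances.
    """
    n = len(target_samples)
    dims = len(target_samples[0])
    mean = [sum(x[d] for x in target_samples) / n for d in range(dims)]
    std = []
    for d in range(dims):
        var = sum((x[d] - mean[d]) ** 2 for x in target_samples) / n
        std.append(math.sqrt(var) or 1.0)  # avoid dividing by zero
    dists = sorted(_distance(x, mean, std) for x in target_samples)
    threshold = dists[min(n - 1, int(quantile * n))]
    return mean, std, threshold


def is_target(x, model):
    """True if x is accepted as target class, False if flagged as outlier."""
    mean, std, threshold = model
    return _distance(x, mean, std) <= threshold
```

No samples of the other class are needed for fitting, which is the aspect the abstract singles out as distinguishing one-class from multi-class classification.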
Abstract:
OBJECTIVE: To identify and quantify sources of variability in scores on the speech, spatial, and qualities of hearing scale (SSQ) and its short forms among normal-hearing and hearing-impaired subjects using a French-language version of the SSQ. DESIGN: Multi-regression analyses of SSQ scores were performed using age, gender, years of education, hearing loss, and hearing-loss asymmetry as predictors. Similar analyses were performed for each subscale (Speech, Spatial, and Qualities), for several SSQ short forms, and for differences in subscale scores. STUDY SAMPLE: One hundred normal-hearing subjects (NHS) and 230 hearing-impaired subjects (HIS). RESULTS: Hearing loss in the better ear and hearing-loss asymmetry were the two main predictors of scores on the overall SSQ, the three main subscales, and the SSQ short forms. The greatest difference between the NHS and HIS was observed for the Speech subscale, and the NHS showed scores well below the maximum of 10. An age effect was observed mostly on the Speech subscale items, and the number of years of education had a significant influence on several Spatial and Qualities subscale items. CONCLUSION: Strong similarities between SSQ scores obtained across different populations and languages, and between SSQ and short forms, underline their potential international use.
Abstract:
This paper gives a full description of the phonetics and phonology of Traditional Cockney and Popular London speech, treating these varieties as constituting a continuum rather than two separate dialects. Exemplification of the vowels, diphthongs and consonants is provided, both in isolated words and in connected speech, along with their range of variation. The frequencies of the vowels have been charted on the basis of the pronunciation of three elderly male speakers. Regarding the consonants, there are detailed observations on the features typically associated with the linguistic varieties examined: strong aspiration of unvoiced plosives, glottalization, H-dropping, L-vocalization and TH-fronting. A section on prosody provides coverage of lexical stress, rhythm and intonation. The paper takes into account up-to-date research on these phenomena, but does not deal with the most recent vowel shifts, some of which form part of Multi-cultural London English.
Abstract:
Alzheimer's disease (AD) is the most common type of dementia among the elderly. This work is part of a larger study that aims to identify novel technologies and biomarkers or features for the early detection of AD and its degree of severity. The diagnosis is made by analyzing several biomarkers and conducting a variety of tests (although only a post-mortem examination of the patients’ brain tissue is considered to provide definitive confirmation). Non-invasive intelligent diagnosis techniques would be a very valuable diagnostic aid. This paper concerns the Automatic Analysis of Emotional Response (AAER) in spontaneous speech based on classical and new emotional speech features: Emotional Temperature (ET) and fractal dimension (FD). This is a pre-clinical study aiming to validate tests and biomarkers for future diagnostic use. The method has the great advantage of being non-invasive, low cost, and without any side effects. The AAER shows very promising results for the definition of features useful in the early diagnosis of AD.
Abstract:
Language acquisition is a complex process that requires the synergic involvement of different cognitive functions, which include extracting and storing the words of the language and their embedded rules for progressive acquisition of grammatical information. As has been shown in other fields that study learning processes, synchronization mechanisms between neuronal assemblies might have a key role during language learning. In particular, studying these dynamics may help uncover whether different oscillatory patterns sustain more item-based learning of words and rule-based learning from speech input. Therefore, we tracked the modulation of oscillatory neural activity during the initial exposure to an artificial language, which contained embedded rules. We analyzed both spectral power variations, as a measure of local neuronal ensemble synchronization, as well as phase coherence patterns, as an index of the long-range coordination of these local groups of neurons. Synchronized activity in the gamma band (20–40 Hz), previously reported to be related to the engagement of selective attention, showed a clear dissociation of local power and phase coherence between distant regions. In this frequency range, local synchrony characterized the subjects who were focused on word identification and was accompanied by increased coherence in the theta band (4–8 Hz). Only those subjects who were able to learn the embedded rules showed increased gamma band phase coherence between frontal, temporal, and parietal regions.
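The abstract does not say which estimator was used for phase coherence. As a hedged illustration only, the sketch below computes the phase-locking value (PLV), one common index of phase coherence between two signals. The function name phase_locking_value is hypothetical, and the instantaneous phases (in radians) are assumed to have been extracted beforehand for a band of interest, for example by a wavelet or Hilbert transform that is not shown here.

```python
import cmath


def phase_locking_value(phases_a, phases_b):
    """Phase-locking value between two equally long phase series (radians).

    Returns a number in [0, 1]: 1 means a constant phase difference,
    values near 0 mean the phase difference is spread around the circle.
    """
    if not phases_a or len(phases_a) != len(phases_b):
        raise ValueError("phase series must be non-empty and of equal length")
    # Average the unit vectors pointing at each phase difference.
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)
```

Under this measure, two distant recording sites count as coherent when their phase difference stays stable, regardless of the local power at either site, which is the kind of dissociation between local power and long-range coherence the abstract reports.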
Abstract:
In this paper, we present the Melodic Analysis of Speech method (MAS) that enables us to carry out complete and objective descriptions of a language's intonation, from a phonetic (melodic) point of view as well as from a phonological point of view. It is based on the acoustic-perceptive method by Cantero (2002), which has already been used in research on prosody in different languages. In this case, we present the results of its application in Spanish and Catalan.
Abstract:
This dissertation considers the segmental durations of speech from the viewpoint of speech technology, especially speech synthesis. The idea is that better models of segmental durations lead to higher naturalness and better intelligibility. These features are the key factors for better usability and generality of synthesized speech technology. Even though the studies are based on a Finnish corpus, the approaches apply to all other languages as well. This is possibly due to the fact that most of the studies included in this dissertation are about universal effects taking place at utterance boundaries. The methods invented and used here are also suitable for studies of other languages. This study is based on two corpora of news reading speech and sentences read aloud. One corpus is read aloud by a 39-year-old male, whilst the other consists of several speakers in various situations. The purpose of using two corpora is twofold: it allows a comparison of the corpora and a broader view of the matters of interest. The dissertation begins with an overview of the phonemes and the quantity system in the Finnish language. In particular, we cover the intrinsic durations of phonemes and phoneme categories, as well as the difference in duration between short and long phonemes. The phoneme categories are presented to alleviate the problem of variability of speech segments. In this dissertation we cover the boundary-adjacent effects on segmental durations. In initial positions of utterances we find that there seems to be initial shortening in Finnish, but the result depends on the level of detail and on the individual phoneme. On the phoneme level we find that the shortening or lengthening only affects the very first phonemes at the beginning of an utterance. However, on average, the effect seems to shorten the whole first word on the word level. We establish the effect of final lengthening in Finnish. 
The effect in Finnish has been an open question for a long time, and Finnish has been the last missing piece for establishing it as a universal phenomenon. Final lengthening is studied from various angles and it is also shown that it is not a mere effect of prominence or an effect of a speech corpus with high inter- and intra-speaker variation. The effect of final lengthening seems to extend from the final to the penultimate word. On a phoneme level it reaches a much wider area than the initial effect. We also present a normalization method suitable for corpus studies on segmental durations. The method uses an utterance-level normalization approach to capture the pattern of segmental durations within each utterance. This prevents the impact of various problematic variations within the corpora. The normalization is used in a study on final lengthening to show that the results on the effect are not caused by variation in the material. The dissertation shows an implementation and prowess of speech synthesis on a mobile platform. We find that the rule-based method of speech synthesis is a real-time software solution, but the signal generation process slows down the system beyond real time. Future aspects of speech synthesis on limited platforms are discussed. The dissertation considers ethical issues in the development of speech technology. The main focus is on the development of speech synthesis with high naturalness, but the problems and solutions are applicable to any other speech technology approaches.
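The abstract mentions utterance-level normalization but does not give its formula. One plausible reading, shown here purely as an assumption, is a z-score computed within each utterance, so that every segment duration is expressed relative to the mean and spread of its own utterance. The function name normalize_utterance and the millisecond unit are hypothetical.

```python
import statistics


def normalize_utterance(durations_ms):
    """Z-score the segment durations of one utterance.

    Assumption: normalization uses the utterance's own mean and population
    standard deviation, which removes overall speaking-rate differences
    while keeping the duration pattern inside the utterance.
    """
    mean = statistics.fmean(durations_ms)
    sd = statistics.pstdev(durations_ms)
    if sd == 0:
        # All segments equally long: no pattern to express.
        return [0.0 for _ in durations_ms]
    return [(d - mean) / sd for d in durations_ms]
```

With this choice, a slow and a fast rendition of the same duration pattern map to the same normalized values, so an effect such as final lengthening can be compared across speakers and speaking rates.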
Abstract:
The flow of information within modern information society has increased rapidly over the last decade. The major part of this information flow relies on the individual’s abilities to handle text or speech input. For the majority of us it presents no problems, but there are some individuals who would benefit from other means of conveying information, e.g. signed information flow. During the last decades the new results from various disciplines have all pointed towards a common background and processing for sign and speech, and this was one of the key issues that I wanted to investigate further in this thesis. The basis of this thesis is firmly within speech research and that is why I wanted to design analogous test batteries for widely used speech perception tests for signers – to find out whether the results for signers would be the same as in speakers’ perception tests. One of the key findings within biology – and more precisely its effects on speech and communication research – is the mirror neuron system. That finding has enabled us to form new theories about evolution of communication, and it all seems to converge on the hypothesis that all communication has a common core within humans. In this thesis speech and sign are discussed as equal and analogous counterparts of communication and all research methods used in speech are modified for sign. Both speech and sign are thus investigated using similar test batteries. Furthermore, both production and perception of speech and sign are studied separately. An additional framework for studying production is given by gesture research using cry sounds. Results of cry sound research are then compared to results from children acquiring sign language. These results show that individuality manifests itself from very early on in human development. Articulation in adults, both in speech and sign, is studied from two perspectives: normal production and re-learning production when the apparatus has been changed. 
Normal production is studied both in speech and sign and the effects of changed articulation are studied with regard to speech. Both these studies are done by using carrier sentences. Furthermore, sign production is studied giving the informants the possibility for spontaneous speech. The production data from the signing informants is also used as the basis for input in the sign synthesis stimuli used in the sign perception test battery. Speech and sign perception were studied using the informants’ answers to questions using forced choice in identification and discrimination tasks. These answers were then compared across language modalities. Three different informant groups participated in the sign perception tests: native signers, sign language interpreters and Finnish adults with no knowledge of any signed language. This gave a chance to investigate which of the characteristics found in the results were due to the language per se and which were due to the changes in modality itself. As the analogous test batteries yielded similar results over different informant groups, some common threads of results could be observed. Starting from very early on in acquiring speech and sign the results were highly individual. However, the results were the same within one individual when the same test was repeated. This individuality of results was represented along the same patterns across different language modalities and, on some occasions, across language groups. As both modalities yield similar answers to analogous study questions, this has led us to provide methods for basic input for sign language applications, i.e. signing avatars. This has also given us answers to questions on precision of the animation and intelligibility for the users – what are the parameters that govern intelligibility of synthesised speech or sign and how precise must the animation or synthetic speech be in order for it to be intelligible. 
The results also give additional support to the well-known fact that intelligibility is not the same as naturalness. In some cases, as shown within the sign perception test battery design, naturalness decreases intelligibility. This also has to be taken into consideration when designing applications. All in all, results from each of the test batteries, be they for signers or speakers, yield strikingly similar patterns, which would indicate yet further support for the common core for all human communication. Thus, we can modify and deepen the phonetic framework models for human communication based on the knowledge obtained from the results of the test batteries within this thesis.
Abstract:
Polyacrylamide gel electrophoresis, SDS-PAGE system, was adjusted to detect the presence of additional whey in dairy beverages distributed in a Brazilian Government School Meals Program. Aqueous solutions of samples in 8 M urea were submitted to a polyacrylamide gel gradient (10% to 18%). Gel scans from electrophoresis patterns of previously adulterated milk samples showed that casein peak areas decreased while peak areas of beta-lactoglobulin plus alpha-lactalbumin increased as the percentage of raw milk powder replaced by whey powder increased. The relative densitometer areas of caseins or beta-lactoglobulin plus alpha-lactalbumin plotted against the percentage of whey added to the raw milk showed a squared linear correlation coefficient higher than 0.97. The caseins plot was used to determine the percentage of additional whey in 116 dairy beverages, chocolate or coffee flavor. Considering that the lowest relative casein concentration found in commercial milk powder samples by the present method was 72%, the dairy beverages containing casein percentages equal to or higher than this value were considered free of additional whey. Based on this criterion, about 49% of the coffee-flavor dairy beverages and 29% of the chocolate-flavor beverages, among all the samples analyzed, were adulterated with whey protein to reach the total protein contents specified on their labels. The present method showed a sensitivity of 5% to additional whey.
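The calibration logic in this abstract (a linear plot of relative casein area against percentage of added whey, then reading unknown samples off that line, with 72% casein as the cut-off for "free of additional whey") can be sketched as follows. This is only an illustration: the function names are hypothetical, and the example calibration points in the usage lines are made-up numbers, not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept; also returns r^2."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


def estimate_whey_percent(casein_area, slope, intercept, free_threshold=72.0):
    """Read the added-whey percentage off the casein calibration line.

    Samples at or above the 72% casein cut-off stated in the abstract are
    reported as 0.0 (considered free of additional whey).
    """
    if casein_area >= free_threshold:
        return 0.0
    return (casein_area - intercept) / slope


# Hypothetical calibration points (percent whey added, relative casein area).
whey_added = [0, 10, 20, 30, 40]
casein_area = [76, 71, 66, 61, 56]
slope, intercept, r2 = fit_line(whey_added, casein_area)
```

With these made-up points, estimate_whey_percent(66, slope, intercept) falls on the 20% calibration point, while a sample with 75% casein is reported as free of additional whey.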