959 results for Speech-language therapy
Abstract:
This work examines prosody modelling for the Standard Yorùbá (SY) language in the context of computer text-to-speech synthesis applications. The thesis of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. Our prosody model is conceptualised around a modular holistic framework. The framework is implemented using the Relational Tree (R-Tree) techniques (Ehrich and Foith, 1976). R-Tree is a sophisticated data structure that provides a multi-dimensional description of a waveform. A Skeletal Tree (S-Tree) is first generated using algorithms based on the tone phonological rules of SY. Subsequent steps update the S-Tree by computing the numerical values of the prosody dimensions. To implement the intonation dimension, fuzzy control rules were developed based on data from native speakers of Yorùbá. The Classification And Regression Tree (CART) and the Fuzzy Decision Tree (FDT) techniques were tested in modelling the duration dimension. The FDT was selected based on its better performance. An important feature of our R-Tree framework is its flexibility in that it facilitates the independent implementation of the different dimensions of prosody, i.e. duration and intonation, using different techniques and their subsequent integration. Our approach provides us with a flexible and extendible model that can also be used to implement, study and explain the theory behind aspects of the phenomena observed in speech prosody.
Abstract:
This paper presents a novel prosody model in the context of computer text-to-speech synthesis applications for tone languages. We have demonstrated its applicability using the Standard Yorùbá (SY) language. Our approach is motivated by the theory that abstract and realised forms of various prosody dimensions should be modelled within a modular and unified framework [Coleman, J.S., 1994. Polysyllabic words in the YorkTalk synthesis system. In: Keating, P.A. (Ed.), Phonological Structure and Forms: Papers in Laboratory Phonology III, Cambridge University Press, Cambridge, pp. 293–324]. We have implemented this framework using the Relational Tree (R-Tree) technique. R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. The underlying assumption of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. To implement the intonation dimension, fuzzy logic based rules were developed using speech data from native speakers of Yorùbá. The Fuzzy Decision Tree (FDT) and the Classification and Regression Tree (CART) techniques were tested in modelling the duration dimension. For practical reasons, we have selected the FDT for implementing the duration dimension of our prosody model. To establish the effectiveness of our prosody model, we have also developed a Stem-ML prosody model for SY. We have performed both quantitative and qualitative evaluations on our implemented prosody models. The results suggest that, although the R-Tree model does not predict the numerical speech prosody data as accurately as the Stem-ML model, it produces synthetic speech prosody with better intelligibility and naturalness. The R-Tree model is particularly suitable for speech prosody modelling for languages with limited language resources and expertise, e.g. African languages. Furthermore, the R-Tree model is easy to implement, interpret and analyse.
Abstract:
Background: Current guidelines recommend oral anticoagulation therapy for patients with atrial fibrillation who are at moderate-to-high risk of stroke; however, anticoagulation control (time in therapeutic range (TTR)) is dependent on many factors. Educational and behavioural interventions may impact on patients’ ability to maintain their International Normalised Ratio (INR) control. Objectives: To evaluate the effects on TTR of educational and behavioural interventions for oral anticoagulation therapy (OAT) in patients with atrial fibrillation (AF). Search methods: We searched the Cochrane Central Register of Controlled Trials (CENTRAL) and the Database of Abstracts of Reviews of Effects (DARE) in The Cochrane Library (2012, Issue 7 of 12), MEDLINE Ovid (1950 to week 4 July 2012), EMBASE Classic + EMBASE Ovid (1947 to week 31 2012), PsycINFO Ovid (1806 to week 5 July 2012) on 8 August 2012 and CINAHL Plus with Full Text EBSCO (to August 2012) on 9 August 2012. We applied no language restrictions. Selection criteria: The primary outcome analysed was TTR. Secondary outcomes included decision conflict (patient's uncertainty in making health-related decisions), percentage of INRs in the therapeutic range, major bleeding, stroke and thromboembolic events, patient knowledge, patient satisfaction, quality of life (QoL), and anxiety. Data collection and analysis: The two review authors independently extracted data. Where insufficient data were present to conduct a meta-analysis, effect sizes and confidence intervals (CIs) of the included studies were reported. Data were pooled for two outcomes, TTR and decision conflict. Main results: Eight trials with a total of 1215 AF patients (number of AF participants included in the individual trials ranging from 14 to 434) were included within the review. Studies included education, decision aids, and self-monitoring plus education. For the primary outcome of TTR, data for the AF participants in two self-monitoring plus education trials were pooled and did not favour self-monitoring plus education or usual care in improving TTR, with a mean difference of 6.31 (95% CI -5.63 to 18.25). For the secondary outcome of decision conflict, data from two decision aid trials favoured usual care over the decision aid in terms of reducing decision conflict, with a mean difference of -0.1 (95% CI -0.2 to -0.02). Authors' conclusions: This review demonstrated that there is insufficient evidence to draw definitive conclusions regarding the impact of educational or behavioural interventions on TTR in AF patients receiving OAT. Thus, more trials are needed to examine the impact of interventions on anticoagulation control in AF patients and the mechanisms by which they are successful. It is also important to explore the psychological implications for patients suffering from this long-term chronic condition.
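The two pooled results above differ in whether their 95% confidence interval contains zero, which is why only one of them favours an arm. The following minimal sketch checks this for the reported intervals; the helper name `ci_summary` is hypothetical, and the midpoint check assumes the intervals are symmetric around the estimate, which the abstract does not state.

```python
def ci_summary(lower, upper):
    """Return the interval midpoint and whether the interval excludes zero."""
    midpoint = (lower + upper) / 2
    excludes_zero = lower > 0 or upper < 0
    return midpoint, excludes_zero

# 95% confidence limits as reported in the abstract.
ttr_mid, ttr_excludes_zero = ci_summary(-5.63, 18.25)   # TTR, mean difference 6.31
dc_mid, dc_excludes_zero = ci_summary(-0.2, -0.02)      # decision conflict, mean difference -0.1

print(f"TTR: midpoint {ttr_mid:.2f}, excludes zero: {ttr_excludes_zero}")
print(f"Decision conflict: midpoint {dc_mid:.2f}, excludes zero: {dc_excludes_zero}")
```

The TTR interval spans zero, consistent with neither arm being favoured, while the decision conflict interval lies entirely below zero.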
Abstract:
This paper reports some of the more frequent language changes in Panjabi, the first language of bilingual Panjabi/English children in the West Midlands, UK. Spontaneous spoken data were collected in schools across both languages in three formatted elicitation procedures from 50 bilingual Panjabi/English-speaking children, aged 6–7 years old. Panjabi data from the children are analysed for lexical borrowings and code-switching with English. Several changes of vocabulary and word grammar patterns in Panjabi are identified, many due to interaction with English, and some due to developmental features of Panjabi. There is also evidence of pervasive changes of word order, suggesting a shift in Panjabi word order to that of English. Lexical choice is discussed in terms of language change rather than language deficit. The implications of a normative framework for comparison are explored. A psycholinguistic model interprets grammatical changes in Panjabi.
Abstract:
Current models of word production assume that words are stored as linear sequences of phonemes which are structured into syllables only at the moment of production. This is because syllable structure is always recoverable from the sequence of phonemes. In contrast, we present theoretical and empirical evidence that syllable structure is lexically represented. Storing syllable structure would have the advantage of making representations more stable and resistant to damage. On the other hand, re-syllabifications affect only a minimal part of phonological representations and occur only in some languages and depending on speech register. Evidence for these claims comes from analyses of aphasic errors which not only respect phonotactic constraints, but also avoid transformations which move the syllabic structure of the word further away from the original structure, even when equating for segmental complexity. This is true across tasks, types of errors, and, crucially, types of patients. The same syllabic effects are shown by apraxic patients and by phonological patients who have more central difficulties in retrieving phonological representations. If syllable structure were computed only after phoneme retrieval, it would have no way to influence the errors of phonological patients. Our results have implications for psycholinguistic and computational models of language as well as for clinical and educational practices.
Abstract:
Aim: Sex chromosome aneuploidies increase the risk of spoken or written language disorders but individuals with specific language impairment (SLI) or dyslexia do not routinely undergo cytogenetic analysis. We assess the frequency of sex chromosome aneuploidies in individuals with language impairment or dyslexia. Method: Genome-wide single nucleotide polymorphism genotyping was performed in three sample sets: a clinical cohort of individuals with speech and language deficits (87 probands: 61 males, 26 females; age range 4 to 23 years), a replication cohort of individuals with SLI, from both clinical and epidemiological samples (209 probands: 139 males, 70 females; age range 4 to 17 years), and a set of individuals with dyslexia (314 probands: 224 males, 90 females; age range 7 to 18 years). Results: In the clinical language-impaired cohort, three abnormal karyotypic results were identified in probands (proband yield 3.4%). In the SLI replication cohort, six abnormalities were identified providing a consistent proband yield (2.9%). In the sample of individuals with dyslexia, two sex chromosome aneuploidies were found giving a lower proband yield of 0.6%. In total, two XYY, four XXY (Klinefelter syndrome), three XXX, one XO (Turner syndrome), and one unresolved karyotype were identified. Interpretation: The frequency of sex chromosome aneuploidies within each of the three cohorts was increased over the expected population frequency (approximately 0.25%) suggesting that genetic testing may prove worthwhile for individuals with language and literacy problems and normal non-verbal IQ. Early detection of these aneuploidies can provide information and direct the appropriate management for individuals. © 2013 The Authors. Developmental Medicine & Child Neurology published by John Wiley & Sons Ltd on behalf of Mac Keith Press.
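The proband yields quoted above follow directly from the reported counts. As a quick check, this sketch recomputes them; the function and variable names are illustrative and not taken from the paper.

```python
# Abnormal-karyotype counts and cohort sizes as reported in the abstract.
cohorts = {
    "clinical language-impaired": (3, 87),
    "SLI replication": (6, 209),
    "dyslexia": (2, 314),
}

def proband_yield(abnormal, probands):
    """Percentage of probands with an abnormal result, rounded to one decimal."""
    return round(100.0 * abnormal / probands, 1)

yields = {name: proband_yield(a, n) for name, (a, n) in cohorts.items()}
total_abnormal = sum(a for a, _ in cohorts.values())

for name, pct in yields.items():
    print(f"{name}: {pct}%")
print(f"total abnormal karyotypes: {total_abnormal}")
```

The total of 11 matches the listed breakdown: two XYY, four XXY, three XXX, one XO, and one unresolved karyotype.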
Abstract:
In this paper, we present syllable-based duration modelling in the context of a prosody model for Standard Yorùbá (SY) text-to-speech (TTS) synthesis applications. Our prosody model is conceptualised around a modular holistic framework. This framework is implemented using the Relational Tree (R-Tree) techniques. An important feature of our R-Tree framework is its flexibility in that it facilitates the independent implementation of the different dimensions of prosody, i.e. duration, intonation, and intensity, using different techniques and their subsequent integration. We applied the Fuzzy Decision Tree (FDT) technique to model the duration dimension. In order to evaluate the effectiveness of FDT in duration modelling, we have also developed a Classification And Regression Tree (CART) based duration model using the same speech data. Each of these models was integrated into our R-Tree based prosody model. We performed both quantitative (i.e. Root Mean Square Error (RMSE) and Correlation (Corr)) and qualitative (i.e. intelligibility and naturalness) evaluations on the two duration models. The results show that CART models the training data more accurately than FDT. The FDT model, however, shows a better ability to extrapolate from the training data since it achieved a better accuracy for the test data set. Our qualitative evaluation results show that our FDT model produces synthesised speech that is perceived to be more natural than our CART model. In addition, we also observed that the expressiveness of FDT is much better than that of CART. That is because the representation in FDT is not restricted to a set of piece-wise or discrete constant approximation. We, therefore, conclude that the FDT approach is a practical approach for duration modelling in SY TTS applications. © 2006 Elsevier Ltd. All rights reserved.
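The quantitative measures named above, RMSE and correlation, can be sketched as follows. This is an illustration only: the duration values are hypothetical, and the correlation is assumed to be the Pearson coefficient, which the abstract does not specify.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

def pearson_corr(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical syllable durations in milliseconds (not data from the paper).
observed = [120.0, 180.0, 150.0, 210.0]
predicted = [130.0, 170.0, 160.0, 200.0]

print(f"RMSE: {rmse(predicted, observed):.2f} ms")
print(f"Corr: {pearson_corr(predicted, observed):.3f}")
```

A model that fits training data closely but generalises poorly would show a low RMSE on the training set and a higher one on held-out data, which is the pattern the abstract reports for CART relative to FDT.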
Abstract:
This paper presents a novel intonation modelling approach and demonstrates its applicability using the Standard Yorùbá language. Our approach is motivated by the theory that abstract and realised forms of intonation and other dimensions of prosody should be modelled within a modular and unified framework. In our model, this framework is implemented using the Relational Tree (R-Tree) technique. The R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. Our R-Tree for an utterance is generated in two steps. First, the abstract structure of the waveform, called the Skeletal Tree (S-Tree), is generated using tone phonological rules for the target language. Second, the numerical values of the perceptually significant peaks and valleys on the S-Tree are computed using a fuzzy logic based model. The resulting points are then joined by applying interpolation techniques. The actual intonation contour is synthesised by Pitch Synchronous Overlap and Add (PSOLA) using the Praat software. We performed both quantitative and qualitative evaluations of our model. The preliminary results suggest that, although the model does not predict the numerical speech data as accurately as contemporary data-driven approaches, it produces synthetic speech with comparable intelligibility and naturalness. Furthermore, our model is easy to implement, interpret and adapt to other tone languages.
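The step of joining peaks and valleys into a contour can be illustrated with a small sketch. It assumes linear interpolation, whereas the abstract only says "interpolation techniques"; the function name and the target points (time in seconds, F0 in Hz) are hypothetical.

```python
def interpolate_contour(points, n_samples):
    """Sample a piecewise-linear contour through (time, f0) target points.

    points must be sorted by time with distinct times; n_samples must be >= 2.
    """
    t_start, t_end = points[0][0], points[-1][0]
    contour = []
    seg = 0  # index of the segment currently containing the sample time
    for i in range(n_samples):
        t = t_start + (t_end - t_start) * i / (n_samples - 1)
        # Advance to the segment whose right end is at or beyond t.
        while seg < len(points) - 2 and t > points[seg + 1][0]:
            seg += 1
        (t_a, f_a), (t_b, f_b) = points[seg], points[seg + 1]
        weight = (t - t_a) / (t_b - t_a)
        contour.append((t, f_a + weight * (f_b - f_a)))
    return contour

# Hypothetical alternating valleys and peaks, not values from the paper.
targets = [(0.0, 120.0), (0.2, 160.0), (0.4, 110.0), (0.6, 140.0)]
contour = interpolate_contour(targets, 7)
for t, f0 in contour:
    print(f"{t:.1f} s -> {f0:.1f} Hz")
```

A smoother interpolant, such as a spline, could be substituted without changing how the target points are obtained.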
Abstract:
In this paper we present the design and analysis of an intonation model for text-to-speech (TTS) synthesis applications using a combination of Relational Tree (RT) and Fuzzy Logic (FL) technologies. The model is demonstrated using the Standard Yorùbá (SY) language. In the proposed intonation model, phonological information extracted from text is converted into an RT. RT is a sophisticated data structure that represents the peaks and valleys as well as the spatial structure of a waveform symbolically in the form of trees. An initial approximation to the RT, called Skeletal Tree (ST), is first generated algorithmically. The exact numerical values of the peaks and valleys on the ST are then computed using FL. Quantitative analysis of the result gives an RMSE of 0.56 and 0.71 for peaks and valleys respectively. Mean Opinion Scores (MOS) of 9.5 and 6.8, on a scale of 1 to 10, were obtained for intelligibility and naturalness respectively.
Abstract:
In this report we summarize the state of the art of speech emotion recognition from the signal processing point of view. On the basis of multi-corpus experiments with machine-learning classifiers, we observe that existing approaches to supervised machine learning lead to database-dependent classifiers which cannot be applied to multi-language speech emotion recognition without additional training, because they discriminate the emotion classes according to the training language used. As there are experimental results showing that humans can perform language-independent categorisation, we drew a parallel between machine recognition and the cognitive process and tried to discover the sources of these divergent results. The analysis suggests that the main difference is that speech perception allows extraction of language-independent features, although language-dependent features are incorporated in all levels of the speech signal and play a strong discriminative role in human perception. Based on several results in related domains, we suggest that, in addition, the cognitive process of emotion recognition is based on categorisation, assisted by some hierarchical structure of the emotional categories existing in the cognitive space of all humans. We propose a strategy for developing language-independent machine emotion recognition, related to the identification of language-independent speech features and the use of additional information from visual (expression) features.
Abstract:
The paper presents the history, structure and ongoing activities of the Institute for Bulgarian Language of the Bulgarian Academy of Sciences.
Abstract:
Maternal vocal stimulation plays a vital role in infants’ language acquisition. Contingent maternal imitation and contingent motherese speech were used in an alternating sequence as reinforcers for a 12-month-old infant’s canonical babbling. Both vocal contingencies functioned as reinforcers; however, motherese speech produced the higher frequency of canonical babbling.
Abstract:
Recently, blood oxygen level-dependent (BOLD) functional magnetic resonance imaging (fMRI) has become a routine clinical procedure for localization of language and motor brain regions and has been replacing more invasive preoperative procedures. However, the fMRI results from these tasks are not always reproducible even from the same patient. Evaluating the reproducibility of language and speech mapping is especially complicated due to the complex brain circuitry that may become activated during the functional task. Non-language areas such as sensory, attention, decision-making, and motor brain regions may also be activated in addition to the specific language regions during a traditional sentence-completion task. In this study, I test a new approach, which utilizes 4-minute video-based tasks, to map language and speech brain regions for patients undergoing brain surgery. Results from 35 subjects have shown that the video-based task activates Wernicke’s area, as well as Broca’s area in most subjects. The computed laterality indices, which indicate the dominant hemisphere from that functional task, have indicated left dominance from the video-based tasks. This study has shown that the video-based task may be an alternative method for localization of language and speech brain regions for patients who are unable to complete the sentence-completion task.
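The abstract does not say how its laterality indices were computed. A commonly used definition is LI = (L - R) / (L + R), where L and R are activation measures (for example, suprathreshold voxel counts) in the left and right hemispheres. The sketch below uses that definition with hypothetical counts; the 0.2 cut-off for calling a hemisphere dominant is a frequently used convention and an assumption here, not a value from the study.

```python
def laterality_index(left_active, right_active):
    """LI in [-1, 1]: positive values indicate left-hemisphere dominance."""
    total = left_active + right_active
    if total == 0:
        raise ValueError("no active voxels in either hemisphere")
    return (left_active - right_active) / total

def dominance(li, threshold=0.2):
    """Classify dominance; the threshold is an assumed convention."""
    if li > threshold:
        return "left"
    if li < -threshold:
        return "right"
    return "bilateral"

# Hypothetical voxel counts in language regions of interest.
li = laterality_index(left_active=300, right_active=100)
print(f"LI = {li:.2f} ({dominance(li)} dominant)")
```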
Abstract:
How do infants learn word meanings? Research has established the impact of both parent and child behaviors on vocabulary development; however, the processes and mechanisms underlying these relationships are still not fully understood. Much existing literature focuses on direct paths to word learning, demonstrating that parent speech and child gesture use are powerful predictors of later vocabulary. However, an additional body of research indicates that these relationships don’t always replicate, particularly when assessed in different populations, contexts, or developmental periods.
The current study examines the relationships between infant gesture, parent speech, and infant vocabulary over the course of the second year (10-22 months of age). Through the use of detailed coding of dyadic mother-child play interactions and a combination of quantitative and qualitative data analytic methods, the process of communicative development was explored. Findings reveal non-linear patterns of growth in both parent speech content and child gesture use. Analyses of contingency in dyadic interactions reveal that children are active contributors to communicative engagement through their use of gestures, shaping the type of input they receive from parents, which in turn influences child vocabulary acquisition. Recommendations for future studies and the use of nuanced methodologies to assess changes in the dynamic system of dyadic communication are discussed.