984 results for Decoding Speech Prosody
Abstract:
Identification of emotional facial expressions and emotional prosody (i.e. speech melody) is often impaired in schizophrenia. For facial emotion identification, a recent study suggested that the relative deficit in schizophrenia is more pronounced when the presented emotion is easier to recognize. It is unclear whether this effect is specific to face processing or part of a more general emotion-recognition deficit.
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
Primary objective: To investigate the nature of the motor speech impairments and dysarthria that can arise subsequent to treatment for childhood mid-line cerebellar tumours (CMCT). Research design: The motor speech ability of six children with CMCT was analysed using perceptual and physiological measures and compared with that of a group of non-neurologically impaired children matched for age and sex. Main outcome and results: Three of the children with CMCT were perceived to exhibit dysarthric speech, while the remaining three were judged to have normal speech. The speech disorder in the three affected children was marked by deviances in prosody, articulation and phonation. The underlying pathophysiology was linked to cerebellar damage and expressed as difficulty in co-ordinating the motor speech musculature as required for speech production. These deficits were not identified in the three non-dysarthric children with CMCT. Conclusion: Differential motor speech outcomes occur for children treated for CMCT, and these are discussed in terms of the possible mechanisms responsible for the differences. The need for further investigation of the risk factors for development of motor speech impairment in children treated for CMCT is also highlighted.
Abstract:
In this paper, we present syllable-based duration modelling in the context of a prosody model for Standard Yorùbá (SY) text-to-speech (TTS) synthesis applications. Our prosody model is conceptualised around a modular holistic framework. This framework is implemented using the Relational Tree (R-Tree) technique. An important feature of our R-Tree framework is its flexibility: it facilitates the independent implementation of the different dimensions of prosody, i.e. duration, intonation, and intensity, using different techniques, and their subsequent integration. We applied the Fuzzy Decision Tree (FDT) technique to model the duration dimension. In order to evaluate the effectiveness of FDT in duration modelling, we also developed a Classification And Regression Tree (CART) based duration model using the same speech data. Each of these models was integrated into our R-Tree based prosody model. We performed both quantitative (Root Mean Square Error (RMSE) and correlation (Corr)) and qualitative (intelligibility and naturalness) evaluations of the two duration models. The results show that CART models the training data more accurately than FDT. The FDT model, however, shows a better ability to extrapolate from the training data, since it achieved better accuracy on the test data set. Our qualitative evaluation results show that the FDT model produces synthesised speech that is perceived as more natural than that of the CART model. In addition, we observed that the expressiveness of FDT is much better than that of CART, because the representation in FDT is not restricted to a set of piecewise or discrete constant approximations. We therefore conclude that FDT is a practical approach for duration modelling in SY TTS applications. © 2006 Elsevier Ltd. All rights reserved.
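The abstract names RMSE and correlation as the quantitative metrics used to compare the FDT and CART duration models. A minimal sketch of how such an evaluation could be computed is shown below; the duration values are invented for illustration and do not come from the paper's speech data:

```python
import math

def rmse(predicted, actual):
    """Root Mean Square Error between predicted and observed syllable durations."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def corr(predicted, actual):
    """Pearson correlation between predicted and observed syllable durations."""
    n = len(actual)
    mp = sum(predicted) / n
    ma = sum(actual) / n
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    sa = math.sqrt(sum((a - ma) ** 2 for a in actual))
    return cov / (sp * sa)

# Hypothetical observed and model-predicted syllable durations in milliseconds.
observed   = [180.0, 210.0, 150.0, 240.0, 200.0]
model_pred = [175.0, 220.0, 160.0, 230.0, 205.0]

print(round(rmse(model_pred, observed), 2))  # RMSE in ms
print(round(corr(model_pred, observed), 3))  # correlation, close to 1 is better
```

A lower RMSE indicates a closer numeric fit to the data, while a higher correlation indicates the model tracks the relative duration pattern, which is the sense in which the paper reports CART fitting the training data better while FDT generalises better.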
Abstract:
This paper presents a novel intonation modelling approach and demonstrates its applicability using the Standard Yorùbá language. Our approach is motivated by the theory that abstract and realised forms of intonation and other dimensions of prosody should be modelled within a modular and unified framework. In our model, this framework is implemented using the Relational Tree (R-Tree) technique. The R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. Our R-Tree for an utterance is generated in two steps. First, the abstract structure of the waveform, called the Skeletal Tree (S-Tree), is generated using tone phonological rules for the target language. Second, the numerical values of the perceptually significant peaks and valleys on the S-Tree are computed using a fuzzy logic based model. The resulting points are then joined by applying interpolation techniques. The actual intonation contour is synthesised by Pitch-Synchronous Overlap-Add (PSOLA) using the Praat software. We performed both quantitative and qualitative evaluations of our model. The preliminary results suggest that, although the model does not predict the numerical speech data as accurately as contemporary data-driven approaches, it produces synthetic speech with comparable intelligibility and naturalness. Furthermore, our model is easy to implement, interpret and adapt to other tone languages.
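The contour-generation step described above — perceptually significant peaks and valleys joined by interpolation — can be sketched as follows. The anchor values and the choice of linear interpolation are assumptions for illustration; the paper computes its anchor values with a fuzzy-logic model and leaves the interpolation technique open:

```python
def interpolate_contour(anchors, points_per_segment=5):
    """Join (time, F0) anchor points into a sampled intonation contour.

    `anchors` stand in for the peak/valley targets that the paper's
    fuzzy-logic model would produce; each segment between consecutive
    anchors is filled in by linear interpolation (an assumed choice).
    """
    anchors = sorted(anchors)
    contour = []
    for (t0, f0), (t1, f1) in zip(anchors, anchors[1:]):
        for i in range(points_per_segment):
            frac = i / points_per_segment
            contour.append((t0 + frac * (t1 - t0), f0 + frac * (f1 - f0)))
    contour.append(anchors[-1])  # include the final anchor exactly
    return contour

# Hypothetical peak/valley targets for a short utterance: (time in s, F0 in Hz).
targets = [(0.0, 120.0), (0.25, 180.0), (0.5, 110.0), (0.8, 140.0)]
curve = interpolate_contour(targets, points_per_segment=5)
```

In the actual system, a contour like `curve` would then be imposed on the speech signal with PSOLA resynthesis in Praat; this sketch only covers the interpolation stage.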
Abstract:
This study investigated the effects of an explicit, individualized phonemic awareness intervention administered by a speech-language pathologist to 4 prekindergarten children with phonological speech sound disorders. Research has demonstrated that children with moderate-severe expressive phonological disorders are at risk for poor literacy development because they often concurrently exhibit weaknesses in the development of phonological awareness skills (Rvachew, Ohberg, Grawburg, & Heyding, 2003). The research design chosen for this study was a single-subject multiple-probe design across subjects. After stable baseline measures, the participants received explicit instruction in each of three phases separately and sequentially. Dependent measures included same-day tests for Phase I (Phoneme Identity), Phase II (Phoneme Blending), and Phase III (Phoneme Segmentation), and generalization and maintenance tests for all three phases. All 4 participants made substantial progress in all three phases, and these skills were maintained during weekly and biweekly maintenance measures. Generalization measures indicated that the participants showed some increases in their mean total number of correct responses in the Phase II and Phase III baselines while in Phase I intervention, and more substantial increases in the Phase III baseline while in Phase II intervention. The increased generalization from Phase II to Phase III is likely explained by the response similarities between those two skills (Cooper, Heron, & Heward, 2007). Based upon the findings of this study, speech-language pathologists should evaluate phonological awareness in the children on their caseloads prior to kindergarten entry, and should allocate time during speech therapy to enhance phonological awareness and letter knowledge so as to support the development of both skills concurrently.
Classroom teachers should also collaborate with speech-language pathologists to identify at-risk students in their classrooms and implement evidence-based phonemic awareness instruction. Future research should replicate this study with larger groups of children, children with combined speech and language delays, children of different ages, and ESOL students.