924 results for acoustic speech recognition system


Relevance:

40.00%

Publisher:

Abstract:

Found also in Appendix to Congressional globe, 28th Congress, 1st session. v. 13, p. 253-258. Washington, 1844. Serial no. 83-84.

Relevance:

40.00%

Publisher:

Abstract:

Purpose: This pilot study explored the feasibility and effectiveness of an Internet-based telerehabilitation application for the assessment of motor speech disorders in adults with acquired neurological impairment. Method: Using a counterbalanced, repeated measures research design, 2 speech-language pathologists assessed 19 speakers with dysarthria on a battery of perceptual assessments. The assessments included a 19-item version of the Frenchay Dysarthria Assessment (FDA; P. Enderby, 1983), the Assessment of Intelligibility of Dysarthric Speech (K. M. Yorkston & D. R. Beukelman, 1981), perceptual analysis of a speech sample, and an overall rating of severity of the dysarthria. One assessment was conducted in the traditional face-to-face manner, whereas the other assessment was conducted using an online, custom-built telerehabilitation application. This application enabled real-time videoconferencing at 128 kb/s and the transfer of store-and-forward audio and video data between the speaker and speech-language pathologist sites. The assessment methods were compared using the J. M. Bland and D. G. Altman (1986, 1999) limits-of-agreement method and percentage level of agreement between the 2 methods. Results: Measurements of severity of dysarthria, percentage intelligibility in sentences, and most perceptual ratings made in the telerehabilitation environment were found to fall within the clinically acceptable criteria. However, several ratings on the FDA were not comparable between the environments, and explanations for these results were explored. Conclusions: The online assessment of motor speech disorders using an Internet-based telerehabilitation system is feasible. This study suggests that with additional refinement of the technology and assessment protocols, reliable assessment of motor speech disorders over the Internet is possible. Future research methods are outlined.
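The two comparison statistics named in this abstract can be illustrated with a minimal sketch. It assumes paired scores from the face-to-face and online sessions are available as plain lists. The function names, the conventional 1.96 multiplier, and the `tolerance` parameter are illustrative assumptions and are not taken from the study, which does not report its exact computation here.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Bland-Altman style limits of agreement for paired measurements.

    Returns (bias, lower, upper), where bias is the mean of the paired
    differences and the limits are bias -/+ z * SD of the differences.
    z = 1.96 is the conventional choice (assumed here).
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


def percent_agreement(ratings_a, ratings_b, tolerance=0):
    """Percentage of paired ratings that differ by at most `tolerance`.

    `tolerance` is a hypothetical parameter: 0 means exact agreement,
    1 means agreement within one scale point.
    """
    if len(ratings_a) != len(ratings_b):
        raise ValueError("paired ratings must have equal length")
    hits = sum(1 for a, b in zip(ratings_a, ratings_b) if abs(a - b) <= tolerance)
    return 100.0 * hits / len(ratings_a)
```

A clinically acceptable criterion would then be checked by comparing the returned limits against a threshold chosen in advance; that threshold is a clinical decision and is not encoded in the sketch.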

Relevance:

40.00%

Publisher:

Abstract:

Previous investigations employing electropalatography (EPG) have identified articulatory timing deficits in individuals with acquired dysarthria. However, this technology is yet to be applied to the articulatory timing disturbance present in Parkinson's disease (PD). As a result, the current investigation aimed to use EPG to comprehensively examine the temporal aspects of articulation in a group of nine individuals with PD at sentence, word and segment level. This investigation followed on from a prior study (McAuliffe, Ward and Murdoch) and, similarly, aimed to compare the results of the participants with PD to a group of aged (n=7) and young controls (n=8) to determine if ageing contributed to any articulatory timing deficits observed. Participants were required to read aloud the phrase "I saw a ___ today" with the EPG palate in situ. Target words included the consonants /l/, /s/ and /t/ in initial position in both the /i/ and /a/ vowel environments. Perceptual investigation of speech rate was conducted in addition to objective measurement of sentence, word and segment duration. Segment durations included the total segment length and duration of the approach, closure/constriction and release phases of EPG consonant production. Results of the present study revealed impaired speech rate, perceptually, in the group with PD. However, this was not confirmed objectively. Electropalatographic investigation of segment durations indicated that, in general, the group with PD demonstrated segment durations consistent with the control groups. Only one significant difference was noted, with the group with PD exhibiting significantly increased duration of the release phase for /la/ when compared to both the control groups. It is, therefore, possible that EPG failed to detect lingual movement impairment as it does not measure the complete tongue movement towards and away from the hard palate. Furthermore, the contribution of individual variation to the present findings should not be overlooked.

Relevance:

40.00%

Publisher:

Abstract:

Although developmental increases in the size of the position effect within a mispronunciation detection task have been interpreted as consistent with a view of the lexical restructuring process as protracted, the position effect itself might not be reliable. The current research examined the effects of position and clarity of acoustic-phonetic information on sensitivity to mispronounced onsets in 5- and 6-year-olds and adults. Both children and adults showed a position effect only when mispronunciations also differed in the amount of relevant acoustic-phonetic information. Adults' sensitivity to mispronounced second-syllable onsets also reflected the availability of acoustic-phonetic information. The implications of these findings are discussed in relation to the lexical restructuring hypothesis. (c) 2006 Elsevier Inc. All rights reserved.

Relevance:

40.00%

Publisher:

Abstract:

Speech comprises dynamic and heterogeneous acoustic elements, yet it is heard as a single perceptual stream even when accompanied by other sounds. The relative contributions of grouping “primitives” and of speech-specific grouping factors to the perceptual coherence of speech are unclear, and the acoustical correlates of the latter remain unspecified. The parametric manipulations possible with simplified speech signals, such as sine-wave analogues, make them attractive stimuli to explore these issues. Given that the factors governing perceptual organization are generally revealed only where competition operates, the second-formant competitor (F2C) paradigm was used, in which the listener must resist competition to optimize recognition [Remez et al., Psychol. Rev. 101, 129-156 (1994)]. Three-formant (F1+F2+F3) sine-wave analogues were derived from natural sentences and presented dichotically (one ear = F1+F2C+F3; opposite ear = F2). Different versions of F2C were derived from F2 using separate manipulations of its amplitude and frequency contours. F2Cs with time-varying frequency contours were highly effective competitors, regardless of their amplitude characteristics. In contrast, F2Cs with constant frequency contours were completely ineffective. Competitor efficacy was not due to energetic masking of F3 by F2C. These findings indicate that modulation of the frequency, but not the amplitude, contour is critical for across-formant grouping.
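To make the stimulus construction more concrete, the following is a rough sketch of how a sine-wave analogue and a dichotic pair might be assembled from per-sample formant frequency and amplitude contours. The abstract does not specify how the contours were extracted or exactly how F2C was derived from F2, so the phase-accumulation synthesis, the use of the mean frequency for the constant-frequency competitor, and all function names are assumptions made for illustration.

```python
import math


def sine_track(freqs_hz, amps, sample_rate=16000):
    """Synthesize one time-varying sinusoid by phase accumulation.

    freqs_hz and amps hold one value per output sample.
    """
    phase = 0.0
    samples = []
    for freq, amp in zip(freqs_hz, amps):
        samples.append(amp * math.sin(phase))
        phase += 2.0 * math.pi * freq / sample_rate
    return samples


def constant_frequency(freqs_hz):
    """Flatten a frequency contour to its mean (an assumed manipulation)."""
    mean_freq = sum(freqs_hz) / len(freqs_hz)
    return [mean_freq] * len(freqs_hz)


def mix(*tracks):
    """Sum several equal-length tracks sample by sample."""
    return [sum(values) for values in zip(*tracks)]


def dichotic_pair(f1, f2, f2c, f3, sample_rate=16000):
    """Build the two ear signals: one ear = F1+F2C+F3, opposite ear = F2.

    Each argument is a (freqs_hz, amps) tuple of per-sample contours.
    """
    ear_a = mix(
        sine_track(*f1, sample_rate),
        sine_track(*f2c, sample_rate),
        sine_track(*f3, sample_rate),
    )
    ear_b = sine_track(*f2, sample_rate)
    return ear_a, ear_b
```

In this sketch, a time-varying competitor would reuse a modified copy of the F2 frequency contour, whereas `constant_frequency` yields the kind of flat-contour competitor that the abstract reports to be ineffective.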

Relevance:

40.00%

Publisher:

Abstract:

This paper presents a novel prosody model in the context of computer text-to-speech synthesis applications for tone languages. We have demonstrated its applicability using the Standard Yorùbá (SY) language. Our approach is motivated by the theory that abstract and realised forms of various prosody dimensions should be modelled within a modular and unified framework [Coleman, J.S., 1994. Polysyllabic words in the YorkTalk synthesis system. In: Keating, P.A. (Ed.), Phonological Structure and Forms: Papers in Laboratory Phonology III, Cambridge University Press, Cambridge, pp. 293–324]. We have implemented this framework using the Relational Tree (R-Tree) technique. R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. The underlying assumption of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. To implement the intonation dimension, fuzzy logic based rules were developed using speech data from native speakers of Yorùbá. The Fuzzy Decision Tree (FDT) and the Classification and Regression Tree (CART) techniques were tested in modelling the duration dimension. For practical reasons, we have selected the FDT for implementing the duration dimension of our prosody model. To establish the effectiveness of our prosody model, we have also developed a Stem-ML prosody model for SY. We have performed both quantitative and qualitative evaluations on our implemented prosody models. The results suggest that, although the R-Tree model does not predict the numerical speech prosody data as accurately as the Stem-ML model, it produces synthetic speech prosody with better intelligibility and naturalness. The R-Tree model is particularly suitable for speech prosody modelling for languages with limited language resources and expertise, e.g. African languages. Furthermore, the R-Tree model is easy to implement, interpret and analyse.

Relevance:

40.00%

Publisher:

Abstract:

* This work was financially supported by RFBR-04-01-00858.

Relevance:

40.00%

Publisher:

Abstract:

This work describes a new pattern recognition method that unifies algebraic and statistical approaches. At its core is a voting procedure over statistically weighted regularities, which are linear separators in two-dimensional projections of the feature space. The report briefly describes the method's theoretical foundations and its software implementation, and presents the results of a series of experiments demonstrating its usefulness in practical tasks.
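The voting idea can be illustrated with a small sketch for the two-class case. It assumes the regularities and their statistical weights have already been found; how the method actually discovers and weights them is not described in this abstract, so the `Regularity` structure, its field names, and the signed-sum decision rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Regularity:
    """A linear separator in the 2-D projection onto features (i, j).

    All fields are illustrative: w_i, w_j and bias define the line,
    and weight stands in for the statistical weight of the regularity.
    """
    i: int
    j: int
    w_i: float
    w_j: float
    bias: float
    weight: float

    def vote(self, x):
        # Which side of the line the projected point falls on decides
        # the sign of this regularity's weighted vote.
        side = self.w_i * x[self.i] + self.w_j * x[self.j] + self.bias
        return self.weight if side >= 0 else -self.weight


def classify(x, regularities):
    """Return +1 or -1 according to the sign of the weighted vote total."""
    total = sum(r.vote(x) for r in regularities)
    return 1 if total >= 0 else -1


# Hypothetical regularities over a three-feature space.
example_regularities = [
    Regularity(i=0, j=1, w_i=1.0, w_j=0.0, bias=-1.0, weight=0.9),
    Regularity(i=1, j=2, w_i=0.0, w_j=1.0, bias=-1.0, weight=0.3),
    Regularity(i=0, j=2, w_i=-1.0, w_j=0.0, bias=0.5, weight=0.4),
]
```

Because each separator involves only two features, individual regularities remain easy to plot and inspect, which is one plausible reason to combine many low-dimensional separators by voting instead of fitting a single high-dimensional boundary.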

Relevance:

40.00%

Publisher:

Abstract:

This paper considers the neural-like growing networks used in an intelligent image recognition system. All operations performed on the image at the preliminary stage, as well as the classification and storage of information about images and their subsequent identification, are carried out exclusively by neural-like network mechanisms, without complex algorithms that require large amounts of computation. With appropriate hardware support, neural network methods can considerably increase the efficiency of solving this class of problems while maintaining high accuracy and fast response in both training and identification modes.