2 resultados para Vowels
em Cambridge University Engineering Department Publications Database
Resumo:
Boltzmann machines offer a new and exciting approach to automatic speech recognition, and provide a rigorous mathematical formalism for parallel computing arrays. In this paper we briefly summarize Boltzmann machine theory, and present results showing their ability to recognize both static and time-varying speech patterns. A machine with 2000 units was able to distinguish between the 11 steady-state vowels in English with an accuracy of 85%. The stability of the learning algorithm and methods of preprocessing and coding speech data before feeding it to the machine are also discussed. A new type of unit called a carry input unit, which involves a type of state-feedback, was developed for the processing of time-varying patterns and this was tested on a few short sentences. Use is made of the implications of recent work into associative memory, and the modelling of neural arrays to suggest a good configuration of Boltzmann machines for this sort of pattern recognition.
Resumo:
Human listeners can identify vowels regardless of speaker size, although the sound waves for an adult and a child speaking the ’same’ vowel would differ enormously. The differences are mainly due to the differences in vocal tract length (VTL) and glottal pulse rate (GPR) which are both related to body size. Automatic speech recognition machines are notoriously bad at understanding children if they have been trained on the speech of an adult. In this paper, we propose that the auditory system adapts its analysis of speech sounds, dynamically and automatically to the GPR and VTL of the speaker on a syllable-to-syllable basis. We illustrate how this rapid adaptation might be performed with the aid of a computational version of the auditory image model, and we propose that an auditory preprocessor of this form would improve the robustness of speech recognisers.