Digital Library

8 results for Speech Rate

in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast

Application of artificial neural network techniques to low bit-rate speech coding

Relevance:

40.00%

Abstract:

Research has been undertaken to investigate the use of artificial neural network (ANN) techniques to improve the performance of a low bit-rate vector transform coder. Considerable improvements in the perceptual quality of the coded speech have been obtained. New ANN-based methods for vector quantiser (VQ) design and for the adaptive updating of VQ codebooks are introduced for use in speech coding applications.

THE APPLICATION OF ARTIFICIAL NEURAL NETWORK TECHNIQUES TO LOW BIT-RATE SPEECH CODING

Relevance:

40.00%

A study of the effects of cochlear loss on the auditory brainstem response (ABR) specificity and false positive rate in retrocochlear assessment.

Relevance:

30.00%

Inter-Frame Contextual Modelling For Visual Speech Recognition

Relevance:

30.00%

Abstract:

In this paper, we present a new approach to visual speech recognition which improves contextual modelling by combining Inter-Frame Dependent and Hidden Markov Models. This approach captures contextual information in visual speech that may be lost using a Hidden Markov Model alone. We apply contextual modelling to a large speaker-independent isolated digit recognition task, and compare our approach to two commonly adopted feature-based techniques for incorporating speech dynamics. Results are presented from baseline feature-based systems and the combined modelling technique. We illustrate that both of these techniques achieve similar levels of performance when used independently. However, significant improvements in performance can be achieved through a combination of the two. In particular, we report an improvement in excess of 17% relative Word Error Rate in comparison to our best baseline system.
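The "relative" Word Error Rate improvement quoted in the abstract is a ratio against the baseline error, not a difference in percentage points. The short sketch below shows that calculation. The function name and the example error rates are hypothetical and chosen only for illustration; the abstract does not report the underlying absolute figures.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical error rates (NOT taken from the paper): a baseline at 10.0% WER
# and a combined system at 8.3% WER.
baseline = 0.100
combined = 0.083

reduction = relative_wer_reduction(baseline, combined)
print(f"absolute change: {(baseline - combined) * 100:.1f} percentage points")
print(f"relative reduction: {reduction:.1%}")
```

With these illustrative numbers, a drop of 1.7 percentage points corresponds to a relative reduction of about 17%, which is why relative figures can look much larger than the absolute change.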

Rise time and formant transition duration in the discrimination of speech sounds: The Ba–Wa distinction in developmental dyslexia

Relevance:

30.00%

Abstract:

Across languages, children with developmental dyslexia have a specific difficulty with the neural representation of the sound structure (phonological structure) of speech. One likely cause of their difficulties with phonology is a perceptual difficulty in auditory temporal processing (Tallal, 1980). Tallal (1980) proposed that basic auditory processing of brief, rapidly successive acoustic changes is compromised in dyslexia, thereby affecting phonetic discrimination (e.g. discriminating /b/ from /d/) via impaired discrimination of formant transitions (rapid acoustic changes in frequency and intensity). However, an alternative auditory temporal hypothesis is that the basic auditory processing of the slower amplitude modulation cues in speech is compromised (Goswami, 2002). Here, we contrast children's perception of a synthetic speech contrast (ba/wa) when it is based on the speed of the rate of change of frequency information (formant transition duration) versus the speed of the rate of change of amplitude modulation (rise time). We show that children with dyslexia have excellent phonetic discrimination based on formant transition duration, but poor phonetic discrimination based on envelope cues. The results explain why phonetic discrimination may be allophonic in developmental dyslexia (Serniclaes, 2004), and suggest new avenues for the remediation of developmental dyslexia. © 2010 Blackwell Publishing Ltd.

Very low bit-rate embedded color image coding with SPIHT

Relevance:

30.00%

ASR emotional speech: Clarifying the issues and enhancing performance

Relevance:

30.00%

Abstract:

There are multiple reasons to expect that recognising the verbal content of emotional speech will be a difficult problem, and recognition rates reported in the literature are in fact low. Including information about prosody improves recognition rate for emotions simulated by actors, but its relevance to the freer patterns of spontaneous speech is unproven. This paper shows that recognition rate for spontaneous emotionally coloured speech can be improved by using a language model based on increased representation of emotional utterances. The models are derived by adapting an already existing corpus, the British National Corpus (BNC). An emotional lexicon is used to identify emotionally coloured words, and sentences containing these words are recombined with the BNC to form a corpus with a raised proportion of emotional material. Using a language model based on that technique improves recognition rate by about 20%. (c) 2005 Elsevier Ltd. All rights reserved.
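The corpus-enrichment idea described in the abstract (select sentences containing emotionally coloured words, then recombine them with the base corpus) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name, the whitespace tokenisation, the `copies` parameter, and the choice to "recombine" by appending duplicates are all hypothetical.

```python
def enrich_corpus(sentences, emotional_lexicon, copies=1):
    """Return the corpus plus extra copies of sentences containing lexicon words.

    Assumption: "recombining" is modelled as appending `copies` duplicates of
    each emotional sentence, which raises the proportion of emotional material.
    """
    lexicon = {word.lower() for word in emotional_lexicon}

    def is_emotional(sentence):
        # Naive tokenisation: split on whitespace and strip common punctuation.
        tokens = (tok.strip(".,!?;:\"'").lower() for tok in sentence.split())
        return any(tok in lexicon for tok in tokens)

    emotional = [s for s in sentences if is_emotional(s)]
    return list(sentences) + emotional * copies


# Toy data standing in for a real corpus and lexicon (both hypothetical).
corpus = ["I am furious today.", "The train leaves at noon."]
lexicon = {"furious", "delighted"}

enriched = enrich_corpus(corpus, lexicon, copies=1)
share = sum("furious" in s for s in enriched) / len(enriched)
print(len(enriched), f"{share:.2f}")
```

A language model trained on the enriched list would see emotional sentences more often than in the original corpus; how strongly to weight them (here, `copies`) is a design choice the abstract does not specify.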

On the sum rate of ZF detectors in correlated K fading MIMO channels

Relevance:

30.00%
