Digital Library

971 results for Speech Processing

Improving speech transcription for Mandarin-English translation

Relevance: 30.00%

Abstract:

This paper describes the development of the CU-HTK Mandarin Speech-To-Text (STT) system and assesses its performance as part of a transcription-translation pipeline which converts broadcast Mandarin audio into English text. Recent improvements to the STT system are described and these give Character Error Rate (CER) gains of 14.3% absolute for a Broadcast Conversation (BC) task and 5.1% absolute for a Broadcast News (BN) task. The output of these STT systems is then post-processed, so that it consists of sentence-like segments, and translated into English text using a Statistical Machine Translation (SMT) system. The performance of the transcription-translation pipeline is evaluated using the Translation Edit Rate (TER) and BLEU metrics. It is shown that improving both the STT system and the post-STT segmentations can lower the TER scores by up to 5.3% absolute and increase the BLEU scores by up to 2.7% absolute. © 2007 IEEE.
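The abstract above reports its gains in absolute percentage points. As a minimal sketch of how a Character Error Rate and an absolute versus relative gain can be computed, assuming the usual edit-distance definition of CER: the function names and the sample strings below are hypothetical illustrations, not taken from the paper or from the CU-HTK system.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, insertions and deletions all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """CER in percent: character-level edit distance over reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical reference transcript and two hypothetical system outputs.
reference = "今天天气很好"
baseline = "今天天汽很"     # one substitution and one deletion
improved = "今天天气很号"   # one substitution

cer_baseline = character_error_rate(reference, baseline)
cer_improved = character_error_rate(reference, improved)

# An absolute gain is a difference in percentage points;
# a relative gain is that difference as a fraction of the baseline.
absolute_gain = cer_baseline - cer_improved
relative_gain = 100.0 * absolute_gain / cer_baseline

print(f"baseline CER {cer_baseline:.2f}%, improved CER {cer_improved:.2f}%")
print(f"gain: {absolute_gain:.2f}% absolute, {relative_gain:.2f}% relative")
```

The same edit-distance routine works on lists of words instead of strings, which gives a word error rate under the same assumptions.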

Automatic transcription of conversational telephone speech

Relevance: 30.00%

Abstract:

This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation-side-based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion-network-based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class-based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system. © 2005 IEEE.

Lightly supervised recognition for automatic alignment of large coherent speech recordings

Relevance: 30.00%

Canonical state models for automatic speech recognition

Relevance: 30.00%

Autoregressive clustering for HMM speech synthesis

Relevance: 30.00%

Asymptotically exact noise-corrupted speech likelihoods

Relevance: 30.00%

Recent improvements to the Cambridge Arabic speech-to-text systems

Relevance: 30.00%

Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction

Relevance: 30.00%

Probabilistic modelling of F0 in unvoiced regions in HMM-based speech synthesis

Relevance: 30.00%

Training and adapting MLP features for Arabic speech recognition

Relevance: 30.00%

Bayesian discriminative adaptation for speech recognition

Relevance: 30.00%

Extended VTS for noise-robust speech recognition

Relevance: 30.00%

Extended VTS for noise-robust speech recognition

Relevance: 30.00%

Trajectory training considering global variance for HMM-based speech synthesis

Relevance: 30.00%

Phonetic pronunciations for Arabic speech-to-text systems

Relevance: 30.00%
