Digital Library

16 results for Telephone.

in Cambridge University Engineering Department Publications Database

Development of the 2003 CU-HTK conversational telephone speech transcription system

Relevance: 20.00%

Abstract:

This paper describes the development of the 2003 CU-HTK large vocabulary speech recognition system for Conversational Telephone Speech (CTS). The system was designed based on a multi-pass, multi-branch structure where the output of all branches is combined using system combination. A number of advanced modelling techniques such as Speaker Adaptive Training, Heteroscedastic Linear Discriminant Analysis, Minimum Phone Error estimation and specially constructed Single Pronunciation dictionaries were employed. The effectiveness of each of these techniques and their potential contribution to the result of system combination was evaluated in the framework of a state-of-the-art LVCSR system with sophisticated adaptation. The final 2003 CU-HTK CTS system constructed from some of these models is described and its performance on the DARPA/NIST 2003 Rich Transcription (RT-03) evaluation test set is discussed.

Automatic transcription of conversational telephone speech

Relevance: 20.00%

Abstract:

This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation side based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion network based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system. © 2005 IEEE.

Acoustic training from heterogeneous data sources: experiments in Mandarin conversational telephone speech transcription

Relevance: 20.00%

Development of the CUHTK 2004 Mandarin conversational telephone speech transcription system

Relevance: 20.00%

Development of the 2003 CU-HTK conversational telephone speech transcription system

Relevance: 20.00%

Development of the CUHTK 2004 RT04F Mandarin conversational telephone speech transcription system

Relevance: 20.00%

A PLSA-based language model for conversational telephone speech

Relevance: 20.00%

Generating and evaluating segmentations for automatic speech recognition of conversational telephone speech

Relevance: 20.00%

Automatic transcription of conversational telephone speech: development of the CU-HTK 2002 system

Relevance: 20.00%

New features in the CU-HTK system for transcription of conversational telephone speech

Relevance: 20.00%

Large scale MMIE training for conversational telephone speech recognition

Relevance: 20.00%

The 1998 HTK system for transcription of conversational telephone speech

Relevance: 20.00%

Automatic transcription of conversational telephone speech

Relevance: 20.00%

Calorimetry for power conversion in mobile telephone chargers

Relevance: 20.00%

Paraphrastic language models

Relevance: 10.00%

Abstract:

In natural languages multiple word sequences can represent the same underlying meaning. Only modelling the observed surface word sequence can result in poor context coverage, for example, when using n-gram language models (LM). To handle this issue, this paper presents a novel form of language model, the paraphrastic LM. A phrase level transduction model that is statistically learned from standard text data is used to generate paraphrase variants. LM probabilities are then estimated by maximizing their marginal probability. Significant error rate reductions of 0.5%-0.6% absolute were obtained on a state-of-the-art conversational telephone speech recognition task using a paraphrastic multi-level LM modelling both word and phrase sequences.
