Digital Library

8 results for Perceptual Speech Evaluation

in Cambridge University Engineering Department Publications Database

Speech recognition evaluation: a review of the ARPA CSR programme

Relevance:

40.00%

Speech recognition evaluation: a review of the U.S. CSR and LVCSR programmes

Relevance:

40.00%

Development of the 2003 CU-HTK conversational telephone speech transcription system

Relevance:

30.00%

Abstract:

This paper describes the development of the 2003 CU-HTK large vocabulary speech recognition system for Conversational Telephone Speech (CTS). The system was designed based on a multi-pass, multi-branch structure where the output of all branches is combined using system combination. A number of advanced modelling techniques such as Speaker Adaptive Training, Heteroscedastic Linear Discriminant Analysis, Minimum Phone Error estimation and specially constructed Single Pronunciation dictionaries were employed. The effectiveness of each of these techniques and their potential contribution to the result of system combination was evaluated in the framework of a state-of-the-art LVCSR system with sophisticated adaptation. The final 2003 CU-HTK CTS system constructed from some of these models is described and its performance on the DARPA/NIST 2003 Rich Transcription (RT-03) evaluation test set is discussed.

Automatic transcription of conversational telephone speech

Relevance:

30.00%

Abstract:

This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation side based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion network based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system. © 2005 IEEE.

Multi-channel Bayesian background noise suppression using perceptual cost functions

Relevance:

30.00%

Issues in annotation of the Czech spontaneous speech corpus in the MALACH project

Relevance:

30.00%

IBM's 10xReal-time broadcast news transcription used in the 1999 hub4 evaluation

Relevance:

30.00%

Real user evaluation of spoken dialogue systems using Amazon Mechanical Turk

Relevance:

30.00%

Abstract:

This paper describes a framework for evaluation of spoken dialogue systems. Typically, evaluation of dialogue systems is performed in a controlled test environment with carefully selected and instructed users. However, this approach is very demanding. An alternative is to recruit a large group of users who evaluate the dialogue systems in a remote setting under virtually no supervision. Crowdsourcing technology, for example Amazon Mechanical Turk (AMT), provides an efficient way of recruiting subjects. This paper describes an evaluation framework for spoken dialogue systems using AMT users and compares the obtained results with a recent trial in which the systems were tested by locally recruited users. The results suggest that the use of crowdsourcing technology is feasible and it can provide reliable results. Copyright © 2011 ISCA.
