19 results for phone

in Cambridge University Engineering Department Publications Database


Relevance:

20.00%

Abstract:

This paper investigates a method of automatic pronunciation scoring for use in computer-assisted language learning (CALL) systems. The method utilizes a likelihood-based 'Goodness of Pronunciation' (GOP) measure which is extended to include individual thresholds for each phone based on both averaged native confidence scores and on rejection statistics provided by human judges. Further improvements are obtained by incorporating models of the subject's native language and by augmenting the recognition networks to include expected pronunciation errors. The various GOP measures are assessed using a specially recorded database of non-native speakers which has been annotated to mark phone-level pronunciation errors. Since pronunciation assessment is highly subjective, a set of four performance measures has been designed, each of them measuring different aspects of how well computer-derived phone-level scores agree with human scores. These performance measures are used to cross-validate the reference annotations and to assess the basic GOP algorithm and its refinements. The experimental results suggest that a likelihood-based pronunciation scoring metric can achieve usable performance, especially after applying the various enhancements.
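As a rough illustration of likelihood-based scoring with per-phone thresholds, the sketch below normalizes the log-likelihood gap between the expected phone and the best-scoring phone by the segment length, then rejects the phone when the score exceeds its threshold. The abstract does not give the exact formula, so the function names, the duration normalization, the example values and the default threshold are all assumptions made for illustration.

```python
def gop_score(target_loglik, all_phone_logliks, num_frames):
    """Duration-normalized log-likelihood gap for one phone segment.

    target_loglik: log-likelihood of the segment under the expected phone model.
    all_phone_logliks: log-likelihoods of the same segment under every phone
        model in the set (assumed to include the expected phone).
    num_frames: segment length in frames, used to normalize the score.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    best = max(all_phone_logliks)
    # A score of 0 means the expected phone was also the most likely one.
    return abs(target_loglik - best) / num_frames


def is_rejected(score, phone, thresholds, default_threshold=1.0):
    """Flag a phone as mispronounced if its score exceeds its own threshold.

    thresholds: hypothetical per-phone thresholds, e.g. derived from native
        speaker scores or from human rejection statistics.
    """
    return score > thresholds.get(phone, default_threshold)


# Hypothetical example values, not taken from the paper.
score = gop_score(-120.0, [-120.0, -100.0, -130.0], num_frames=10)
flagged = is_rejected(score, "th", {"th": 1.5})
```

Keeping the thresholds in a per-phone dictionary mirrors the idea of individual thresholds for each phone; a single global threshold would correspond to using only `default_threshold`.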

Relevance:

20.00%

Abstract:

Chapter 6: A Population Perspective on Mobile Phone Related Tasks. M. Bradley, S. Waller, J. Goodman-Deane, I. Hosking, R. Tenneti, P.M. Langdon and P.J. Clarkson. 6.1 Introduction: For design to be truly inclusive, it needs to take into ...

Relevance:

20.00%

Abstract:

This paper introduces a novel method for training an acoustic model that is complementary to a set of given acoustic models. The method is based upon an extension of the Minimum Phone Error (MPE) criterion and aims at producing a model that makes complementary phone errors to those already trained. The technique is therefore called Complementary Phone Error (CPE) training. The method is evaluated using an Arabic large vocabulary continuous speech recognition task. Combination with a CPE-trained system reduced word error rate (WER) by up to 0.7% absolute for a system trained on 172 hours of acoustic data and by up to 0.2% absolute for the final system trained on nearly 2000 hours of Arabic data.
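To make the "absolute" WER figures concrete, the following minimal sketch computes word error rate from a word-level edit distance and contrasts absolute with relative reduction. It is a generic illustration, not the scoring tool used in the paper; the function names and the example sentences and percentages are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """WER in percent: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def absolute_reduction(baseline_wer, new_wer):
    """Difference in percentage points, e.g. 20.0% -> 19.3% is 0.7% absolute."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer, new_wer):
    """Reduction as a percentage of the baseline WER."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical example: one substituted word out of four.
wer = word_error_rate("a b c d", "a x c d")
```

An absolute reduction is quoted in percentage points of WER, so the same absolute gain corresponds to a larger relative gain when the baseline WER is low.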

Relevance:

10.00%

Abstract:

This paper describes the development of the 2003 CU-HTK large vocabulary speech recognition system for Conversational Telephone Speech (CTS). The system was designed based on a multi-pass, multi-branch structure where the output of all branches is combined using system combination. A number of advanced modelling techniques such as Speaker Adaptive Training, Heteroscedastic Linear Discriminant Analysis, Minimum Phone Error estimation and specially constructed Single Pronunciation dictionaries were employed. The effectiveness of each of these techniques and their potential contribution to the result of system combination was evaluated in the framework of a state-of-the-art LVCSR system with sophisticated adaptation. The final 2003 CU-HTK CTS system constructed from some of these models is described and its performance on the DARPA/NIST 2003 Rich Transcription (RT-03) evaluation test set is discussed.

Relevance:

10.00%

Abstract:

This paper discusses the development of the CU-HTK Mandarin Broadcast News (BN) transcription system. The Mandarin BN task includes a significant amount of English data. Hence techniques have been investigated to allow the same system to handle both Mandarin and English by augmenting the Mandarin training sets with English acoustic and language model training data. A range of acoustic models were built including models based on Gaussianised features, speaker adaptive training and feature-space MPE. A multi-branch system architecture is described in which multiple acoustic model types, alternate phone sets and segmentations can be used in a system combination framework to generate the final output. The final system shows state-of-the-art performance over a range of test sets. ©2006 British Crown Copyright.

Relevance:

10.00%

Abstract:

This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation side based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion network based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system. © 2005 IEEE.

Relevance:

10.00%

Abstract:

This paper presents the results of a study of the relationships between measured user capabilities and product demands in a sample of older and disabled users. An empirical study was conducted with 19 users performing tasks with four consumer products (a clock-radio, a mobile phone, a blender and a vacuum cleaner). The sensory, cognitive and motor capabilities of each user were measured using objective capability tests. The study yielded a rich dataset comprising capability measures, product demands, outcome measures (task times and errors), and subjective ratings of difficulty. Scatter plots were produced showing quantified product demands on user capabilities, together with subjective ratings of difficulty. The results are analysed in terms of the strength of the correlations observed, taking into account the limitations of the study sample. Directions for future research are also outlined. © 2011 Springer-Verlag.

Relevance:

10.00%

Abstract:

Recent advances in urban wireless communications and protocols, which spurred the development of city-wide wireless infrastructure, motivated this research, since in many cases construction sites are not conveniently located for wired connectivity. Large-scale transportation projects such as new highways and railroad tracks, along with the utility networks (power lines, phone lines, mobile towers, etc.) that usually follow them, are constructed in areas where wired infrastructure for data exchange is often expensive and time-consuming to deploy. The communication difficulties encountered on such construction sites can be addressed with a wireless communications link between the construction site and the decision-making office. This paper presents a case study on long-range wireless communications suitable for data exchange between construction sites and engineering headquarters. The purpose of this study was to define the requirements for a reliable wireless communications model in which common types of electronic construction data can be exchanged quickly and efficiently, and construction site personnel can interact and share knowledge, information and electronic resources with the office staff.