210 results for automatic generation
in Cambridge University Engineering Department Publications Database
Abstract:
A significant cost in obtaining acoustic training data is the generation of accurate transcriptions. For some sources closed-caption data is available, which allows the use of lightly supervised training techniques. However, for some sources and languages closed captions are not available, and unsupervised training techniques must be used instead. This paper examines the use of unsupervised techniques for discriminative training. In unsupervised training, automatic transcriptions from a recognition system are used for training. As these transcriptions may contain errors, data selection may be useful. Two forms of selection are described: one to remove non-target-language shows, the other to remove segments with low confidence. Experiments were carried out on a Mandarin transcription task. Two types of test data were considered: Broadcast News (BN) and Broadcast Conversations (BC). Results show that the gains from unsupervised discriminative training are highly dependent on the accuracy of the automatic transcriptions. © 2007 IEEE.
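The two forms of data selection described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Segment` record, the `select_segments` function, the set of non-target-language show identifiers, and the 0.8 confidence threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Segment:
    """One automatically transcribed segment (hypothetical record layout)."""
    show_id: str       # identifier of the show the segment came from
    hypothesis: str    # automatic transcription from the recognition system
    confidence: float  # segment-level confidence score, assumed to lie in [0, 1]


def select_segments(
    segments: Iterable[Segment],
    non_target_shows: Set[str],
    min_confidence: float = 0.8,  # assumed threshold, not taken from the paper
) -> List[Segment]:
    """Keep segments from target-language shows whose confidence is high enough."""
    return [
        seg
        for seg in segments
        # Selection 1: drop every segment of a show flagged as non-target language.
        if seg.show_id not in non_target_shows
        # Selection 2: drop segments whose confidence falls below the threshold.
        and seg.confidence >= min_confidence
    ]


segments = [
    Segment("show_a", "hypothesis one", 0.95),
    Segment("show_a", "hypothesis two", 0.40),
    Segment("show_b", "hypothesis three", 0.99),
]
kept = select_segments(segments, non_target_shows={"show_b"})
```

Here only the first segment survives: the second is rejected for low confidence and the third because its show is flagged as non-target language. How the flags and confidence scores are produced is outside the scope of this sketch.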
Abstract:
We present solutions to scattering problems for unsteady disturbances to a mean swirling flow in an annular duct with a rigid 'splitter'. This situation has application to rotor-stator interaction noise in aeroengines, where the flow downstream of the fan is swirling and bifurcates into the by-pass duct and the engine core. We also consider the trailing edge extension of this problem. Inviscid mean flow in a cylindrical annulus is considered, with both axial and swirling (azimuthal) velocity components. The presence of vorticity in the mean flow couples the acoustic and vorticity modes, which are decoupled in irrotational flow. Instead of two separate families we have one combined spectrum of acoustic-vorticity waves in which the 'sonic' and 'nearly-convected' modes are fully coupled. In addition to the aeroacoustics application, the results offer insight into the behaviour of these acoustic-vorticity waves and the precise nature of the coupling between the two types of mode. Two regimes are discussed in which progress has been made: one for a specialised mean flow (uniform axial flow and rigid-body swirl), and a second in which the frequency is assumed large, valid for any axisymmetric mean flow. The Wiener-Hopf technique is used to solve the scattering problems mathematically, and we present numerical evaluations of these solutions. Several new effects are seen to arise due to the mean vorticity, in particular the generation of sound at a trailing edge due to the scattering of a nearly-convected disturbance, in contrast to the way a convected gust silently passes a trailing edge in uniform mean flow.
Abstract:
This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation-side-based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion-network-based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class-based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system. © 2005 IEEE.
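The word error rate quoted in this abstract is conventionally defined as the minimum number of word substitutions, deletions and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below shows that standard definition only. It is not the NIST scoring tool and applies none of the text normalization used in official evaluations; the function name `word_error_rate` is a hypothetical one chosen for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

In this toy example the hypothesis has one substitution ("sit" for "sat") and one deletion (the second "the") against six reference words, so `wer` is 2/6.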