23 results for automatic music analysis
Abstract:
A novel technique for automated topographical analysis in the scanning electron microscope (SEM) has been investigated. It uses a 16-bit minicomputer arranged to act as an automatic focusing unit. The computer is coupled to the objective lens of the microscope through a digital-to-analogue converter and can regulate the excitation of the lens under program control. Further digital-to-analogue converters allow the computer to act as a programmable scan generator by applying ramp waveforms to the scan amplifiers, so that the beam can be swept over a small subregion of the field of interest. The video signal is sampled and applied to an analogue-to-digital converter; the resulting binary numbers are stored in computer memory as an array of values representing relative image intensities within the subregion. A differencing algorithm applied to the collected data finds the level of objective lens excitation at which the sharpness of the image is optimized, and this excitation can be related to the working distance for that subregion through a previous calibration experiment. The theoretical sensitivity of the method for detecting small height changes is of the order of 1 μm.
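The focusing loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the differencing algorithm, so the sum of absolute differences between adjacent samples is assumed here as the sharpness measure. The names `sharpness`, `best_excitation`, and `simulated_acquire` are hypothetical, and `simulated_acquire` is a toy stand-in for the microscope whose image contrast simply falls off as the excitation level moves away from an assumed in-focus value.

```python
from typing import Callable, Iterable, List

Image = List[List[float]]


def sharpness(image: Image) -> float:
    """Sum of absolute differences between horizontally adjacent samples.

    Assumed sharpness measure; the abstract only says "differencing algorithm".
    """
    return sum(
        abs(row[i + 1] - row[i])
        for row in image
        for i in range(len(row) - 1)
    )


def best_excitation(acquire: Callable[[int], Image], levels: Iterable[int]) -> int:
    """Return the excitation level whose sampled subregion is sharpest."""
    return max(levels, key=lambda level: sharpness(acquire(level)))


def simulated_acquire(level: int, in_focus: int = 120) -> Image:
    """Hypothetical stand-in for scanning a subregion at a given lens excitation.

    Produces a 4 x 16 alternating pattern whose contrast drops with defocus.
    """
    contrast = 1.0 / (1.0 + abs(level - in_focus))
    return [
        [0.5 + 0.5 * contrast * (1 if x % 2 else -1) for x in range(16)]
        for _ in range(4)
    ]


if __name__ == "__main__":
    # Sweep the (hypothetical) DAC codes in coarse steps and pick the sharpest.
    print(best_excitation(simulated_acquire, range(0, 256, 8)))
```

In a real instrument the selected excitation would then be mapped to a working distance through the calibration curve mentioned in the abstract; that step is omitted here because the abstract gives no calibration data.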
Abstract:
This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation-side-based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion-network-based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class-based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system. © 2005 IEEE.
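For readers unfamiliar with the metric quoted above, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below computes it with a standard edit-distance recurrence. It is a minimal illustration only: the function name `word_error_rate` is hypothetical, whitespace tokenization is assumed, and it does not reproduce the text normalization or scoring tools used in the NIST evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.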