103 results for Voice training
Abstract:
The paper describes the architecture of VODIS, a voice operated database inquiry system, and presents experiments which investigate the effects on performance of varying the level of a priori syntactic constraints. The VODIS system includes a novel mechanism for incorporating context-free grammatical constraints directly into the word recognition algorithm. This allows the degree of a priori constraint to be smoothly varied and provides for the controlled generation of multiple alternatives. The results show that when the spoken input deviates from the predefined task grammar, weak a priori syntax rules combined with full a posteriori parsing of a lattice of alternative word matches provide the most robust recognition performance. © 1991.
Abstract:
The use of hidden Markov models is placed in a connectionist framework, and an alternative approach to improving their ability to discriminate between classes is described. Using a network style of training, a measure of discrimination based on the a posteriori probability of state occupation is proposed, and the theory for its optimization using error back-propagation and gradient ascent is presented. The method is shown to be numerically well behaved, and results are presented which demonstrate that when using a simple threshold test on the probability of state occupation, the proposed optimization scheme leads to improved recognition performance.
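The quantity at the centre of this abstract, the a posteriori probability of state occupation, can be illustrated with a minimal sketch. This is the standard forward-backward computation for a small discrete HMM followed by a simple threshold test, not the paper's discriminative training procedure. The function names, the discrete emission model, and the 0.5 threshold are assumptions made for illustration, and no scaling is applied, so it is only suitable for short sequences.

```python
def state_occupation(pi, A, B, obs):
    """Return gamma[t][i] = P(state_t = i | obs) for a discrete HMM.

    pi[i]: initial probability, A[i][j]: transition probability,
    B[i][k]: probability of emitting symbol k in state i.
    All names here are illustrative assumptions.
    """
    n, T = len(pi), len(obs)
    # Forward pass: alpha[t][i] = P(obs[0..t], state_t = i)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for t in range(1, T):
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
            for j in range(n)
        ])
    # Backward pass: beta[t][i] = P(obs[t+1..] | state_t = i)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n)
            )
    likelihood = sum(alpha[-1])
    # Posterior occupation probability for every frame and state
    return [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
            for t in range(T)]


def occupied(gamma, state, threshold=0.5):
    """Threshold test: frames where the posterior of `state` exceeds threshold."""
    return [g[state] > threshold for g in gamma]


# Toy two-state model (values chosen arbitrarily for the example)
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
gamma = state_occupation(pi, A, B, [0, 0, 1])
print(occupied(gamma, state=0))
```

In a discriminative setting, the posteriors in `gamma` would be the outputs whose values are pushed toward the correct states by gradient-based training; the sketch stops at computing and thresholding them.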
Abstract:
This paper reports our experiences with a phoneme recognition system for the TIMIT database which uses multiple-mixture continuous-density monophone HMMs trained using MMI. A comprehensive set of results is presented comparing the ML and MMI training criteria for both diagonal and full covariance models. These results, obtained using simple monophone HMMs, show clear performance gains from MMI training and are comparable to the best reported by others, including those which use context-dependent models. In addition, the paper discusses a number of performance and implementation issues which are crucial to successful MMI training.
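For readers unfamiliar with the two training criteria compared here, they are commonly written as follows. The notation is an assumption and is not taken from the paper: $O_r$ is the $r$-th training utterance, $w_r$ its correct transcription, $M_w$ the composite model for a transcription $w$, $P(w)$ a prior over transcriptions, and $\lambda$ the HMM parameters.

```latex
% Maximum likelihood: raise the likelihood of the correct model only
F_{\mathrm{ML}}(\lambda) = \sum_{r=1}^{R} \log p_{\lambda}(O_r \mid M_{w_r})

% Maximum mutual information: raise the correct model's likelihood
% relative to the sum over all competing transcriptions \hat{w}
F_{\mathrm{MMI}}(\lambda) = \sum_{r=1}^{R} \log
  \frac{p_{\lambda}(O_r \mid M_{w_r})\, P(w_r)}
       {\sum_{\hat{w}} p_{\lambda}(O_r \mid M_{\hat{w}})\, P(\hat{w})}
```

The denominator is what makes MMI discriminative: increasing the objective requires lowering the likelihood assigned to incorrect transcriptions as well as raising that of the correct one.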
Abstract:
This paper describes work performed as part of the U.K. Alvey-sponsored Voice Operated Database Inquiry System (VODIS) project in the area of intelligent dialogue control. The principal aims of the work were to develop a habitable interface for the untrained user; to investigate the degree to which dialogue control can be used to compensate for deficiencies in recognition performance; and to examine the requirements on dialogue control for generating natural speech output. A data-driven methodology is described based on the use of frames in which dialogue topics are organized hierarchically. The concept of a dynamically adjustable scope is introduced to permit adaptation to recognizer performance, and the use of historical and hierarchical contexts is described to facilitate the construction of contextually relevant output messages. © 1989.
Abstract:
We introduce a new algorithm to automatically identify the time and pixel location of foot contact events in high-speed video of sprinters. We use this information to autonomously synchronise and overlay multiple recorded performances to provide feedback to athletes and coaches during their training sessions. The algorithm exploits the variation in speed of different parts of the body during sprinting. We use an array of foreground accumulators to identify short-term static pixels and a temporal analysis of the associated static regions to identify foot contacts. We evaluated the technique using 13 videos of three sprinters. It successfully identified 55 of the 56 contacts, with a mean localisation error of 1.39±1.05 pixels. Some videos also produced additional, spurious contacts. We present heuristics to help identify the true contacts. © 2011 Springer-Verlag Berlin Heidelberg.
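One plausible reading of the "foreground accumulators" idea can be sketched as a per-pixel counter of consecutive foreground frames. This is an illustrative assumption, not the authors' implementation: the function names, the reset-to-zero rule, and the `min_frames` parameter are hypothetical, and foreground masks are taken as given (for example from background subtraction).

```python
def update_accumulator(acc, foreground_mask):
    """Count consecutive frames each pixel has been foreground.

    A pixel that leaves the foreground resets to zero, so a large count marks
    a foreground pixel that has stayed in place briefly (e.g. a planted foot),
    while fast-moving body parts never accumulate.
    """
    return [
        [count + 1 if fg else 0 for count, fg in zip(acc_row, mask_row)]
        for acc_row, mask_row in zip(acc, foreground_mask)
    ]


def static_pixels(masks, min_frames):
    """Return, per frame, the set of (row, col) pixels static for >= min_frames."""
    rows, cols = len(masks[0]), len(masks[0][0])
    acc = [[0] * cols for _ in range(rows)]
    result = []
    for mask in masks:
        acc = update_accumulator(acc, mask)
        result.append({
            (r, c)
            for r in range(rows)
            for c in range(cols)
            if acc[r][c] >= min_frames
        })
    return result


# Toy 1x3 "video": the right-hand pixel stays foreground for three frames
masks = [[[1, 0, 1]], [[0, 1, 1]], [[0, 0, 1]], [[1, 0, 0]]]
print(static_pixels(masks, min_frames=3))
```

A temporal analysis step, as the abstract describes, would then group these static pixels into regions and track when each region appears and disappears to decide which ones correspond to foot contacts.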