134 resultados para Body Language
em Cambridge University Engineering Department Publications Database
Discriminative language model adaptation for Mandarin broadcast speech transcription and translation
Resumo:
This paper investigates unsupervised test-time adaptation of language models (LM) using discriminative methods for a Mandarin broadcast speech transcription and translation task. A standard approach to adapt interpolated language models to is to optimize the component weights by minimizing the perplexity on supervision data. This is a widely made approximation for language modeling in automatic speech recognition (ASR) systems. For speech translation tasks, it is unclear whether a strong correlation still exists between perplexity and various forms of error cost functions in recognition and translation stages. The proposed minimum Bayes risk (MBR) based approach provides a flexible framework for unsupervised LM adaptation. It generalizes to a variety of forms of recognition and translation error metrics. LM adaptation is performed at the audio document level using either the character error rate (CER), or translation edit rate (TER) as the cost function. An efficient parameter estimation scheme using the extended Baum-Welch (EBW) algorithm is proposed. Experimental results on a state-of-the-art speech recognition and translation system are presented. The MBR adapted language models gave the best recognition and translation performance and reduced the TER score by up to 0.54% absolute. © 2007 IEEE.
Resumo:
In speech recognition systems language model (LMs) are often constructed by training and combining multiple n-gram models. They can be either used to represent different genres or tasks found in diverse text sources, or capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimum change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences. ©2010 IEEE.
Resumo:
Salmonella enterica causes a range of life-threatening diseases in humans and animals worldwide. Current treatments for S. enterica infections are not sufficiently effective, and there is a need to develop new vaccines and therapeutics. An understanding of how S. enterica spreads in tissues has very important implications for targeting bacteria with vaccine-induced immune responses and antimicrobial drugs. Development of new control strategies would benefit from a more sophisticated evaluation of bacterial location, spatiotemporal patterns of spread and distribution in the tissues, and sites of microbial persistence. We review here recent studies of S. enterica serovar Typhimurium (S. Typhimurium) infections in mice, an established model of systemic typhoid fever in humans, which suggest that continuous bacterial spread to new infection foci and host phagocytes is an essential trait in the virulence of S. enterica during systemic infections. We further highlight how infections within host tissues are truly heterogeneous processes despite the fact that they are caused by the expansion of a genetically homogeneous microbial population. We conclude by discussing how understanding the within-host quantitative, spatial and temporal dynamics of S. enterica infections might aid the development of novel targeted preventative measures and drug regimens.
Resumo:
Existing devices for communicating information to computers are bulky, slow to use, or unreliable. Dasher is a new interface incorporating language modelling and driven by continuous two-dimensional gestures, e.g. a mouse, touchscreen, or eye-tracker. Tests have shown that this device can be used to enter text at a rate of up to 34 words per minute, compared with typical ten-finger keyboard typing of 40-60 words per minute. Although the interface is slower than a conventional keyboard, it is small and simple, and could be used on personal data assistants and by motion-impaired computer users.
Resumo:
State-of-the-art large vocabulary continuous speech recognition (LVCSR) systems often combine outputs from multiple subsystems developed at different sites. Cross system adaptation can be used as an alternative to direct hypothesis level combination schemes such as ROVER. In normal cross adaptation it is assumed that useful diversity among systems exists only at acoustic level. However, complimentary features among complex LVCSR systems also manifest themselves in other layers of modelling hierarchy, e.g., subword and word level. It is thus interesting to also cross adapt language models (LM) to capture them. In this paper cross adaptation of multi-level LMs modelling both syllable and word sequences was investigated to improve LVCSR system combination. Significant error rate gains up to 6.7% rel. were obtained over ROVER and acoustic model only cross adaptation when combining 13 Chinese LVCSR subsystems used in the 2010 DARPA GALE evaluation. © 2010 ISCA.