198 results for free speech


Relevance:

20.00%

Publisher:

Abstract:

We describe a method for text entry, based on inverse arithmetic coding, that relies on gaze direction and is faster and more accurate than using an on-screen keyboard. These benefits derive from two innovations: the writing task is matched to the capabilities of the eye, and a language model is used to make predictable words and phrases easier to write.

Relevance:

20.00%

Publisher:

Abstract:

This paper investigates unsupervised test-time adaptation of language models (LM) using discriminative methods for a Mandarin broadcast speech transcription and translation task. A standard approach to adapting interpolated language models is to optimize the component weights by minimizing the perplexity on supervision data. This is a widely made approximation for language modeling in automatic speech recognition (ASR) systems. For speech translation tasks, it is unclear whether a strong correlation still exists between perplexity and various forms of error cost functions in recognition and translation stages. The proposed minimum Bayes risk (MBR) based approach provides a flexible framework for unsupervised LM adaptation. It generalizes to a variety of forms of recognition and translation error metrics. LM adaptation is performed at the audio document level using either the character error rate (CER) or the translation edit rate (TER) as the cost function. An efficient parameter estimation scheme using the extended Baum-Welch (EBW) algorithm is proposed. Experimental results on a state-of-the-art speech recognition and translation system are presented. The MBR-adapted language models gave the best recognition and translation performance and reduced the TER score by up to 0.54% absolute. © 2007 IEEE.
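The perplexity-minimizing baseline that this abstract contrasts with MBR adaptation can be sketched as follows. This is a minimal illustration of the standard EM update for linear interpolation weights, not the paper's MBR/EBW method; the function names and the toy probabilities are assumptions made for the example.

```python
import math


def em_interpolation_weights(component_probs, iterations=50):
    """Estimate linear interpolation weights by EM.

    component_probs[t][k] is the (strictly positive) probability that
    component LM k assigns to token t of the adaptation data.
    """
    n_components = len(component_probs[0])
    weights = [1.0 / n_components] * n_components
    for _ in range(iterations):
        counts = [0.0] * n_components
        for probs in component_probs:
            mixture = sum(w * p for w, p in zip(weights, probs))
            # Posterior responsibility of each component for this token.
            for k, (w, p) in enumerate(zip(weights, probs)):
                counts[k] += w * p / mixture
        total = sum(counts)
        weights = [c / total for c in counts]
    return weights


def perplexity(component_probs, weights):
    """Perplexity of the interpolated model on the same token stream."""
    log_sum = sum(
        math.log(sum(w * p for w, p in zip(weights, probs)))
        for probs in component_probs
    )
    return math.exp(-log_sum / len(component_probs))


# Toy data (hypothetical): three tokens scored by two component LMs.
toy_probs = [[0.5, 0.1], [0.4, 0.2], [0.1, 0.3]]
tuned = em_interpolation_weights(toy_probs)
print(tuned, perplexity(toy_probs, tuned))
```

Each EM iteration cannot increase perplexity on the data it is tuned on, which is why this scheme is a natural baseline; the abstract's point is that lower perplexity need not translate into lower CER or TER.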

Relevance:

20.00%

Publisher:

Abstract:

The Chinese language is based on characters which are syllabic in nature. Since languages have syllabotactic rules which govern the construction of syllables and their allowed sequences, Chinese character sequence models can be used as a first-level approximation of allowed syllable sequences. N-gram character sequence models were trained on 4.3 billion characters. Characters are used as a first-level recognition unit with multiple pronunciations per character. For comparison, the CU-HTK Mandarin word-based system was used to recognize words, which were then converted to character sequences. The character-only system error rates for one-best recognition were slightly worse than word-based character recognition. However, combining the two systems using log-linear combination gives better results than either system separately. An equally weighted combination gave consistent CER gains of 0.1-0.2% absolute over the word-based standard system. Copyright © 2009 ISCA.
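As a rough illustration of log-linear combination, the sketch below rescores a shared list of hypotheses with a weighted sum of two systems' log scores. The abstract does not say at which level the two systems were combined, so the hypothesis-list setup, the function name and the toy scores are all assumptions.

```python
import math


def log_linear_combine(log_scores_a, log_scores_b, weight_a=0.5, weight_b=0.5):
    """Combine two systems' log scores over their shared hypotheses.

    Each argument maps a hypothesis string to a log score. Returns the
    best hypothesis and the dictionary of combined scores.
    """
    shared = log_scores_a.keys() & log_scores_b.keys()
    if not shared:
        raise ValueError("the two systems share no hypotheses")
    combined = {
        hyp: weight_a * log_scores_a[hyp] + weight_b * log_scores_b[hyp]
        for hyp in shared
    }
    best = max(combined, key=combined.get)
    return best, combined


# Hypothetical scores from a character-based and a word-based system.
char_system = {"hyp_1": math.log(0.6), "hyp_2": math.log(0.4)}
word_system = {"hyp_1": math.log(0.2), "hyp_2": math.log(0.8)}
best, scores = log_linear_combine(char_system, word_system)
print(best)
```

The default weights of 0.5 mirror the equally weighted combination mentioned in the abstract; in practice the weights would be tuned on held-out data.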

Relevance:

20.00%

Publisher:

Abstract:

The high-field properties of polycrystalline superconducting TlBaCaCuO films fabricated by the incorporation of thallium vapour into air-atomised BaCaCuO precursors are described. Thick films with Tc values in the range 106-111 K have been prepared on polycrystalline yttria-stabilised zirconia substrates. The surface morphology, crystal structure and composition of the films are related to their high-field transport and magnetisation properties. Typical 10 mm × 9 mm films show Jc values > 1×10⁴ A/cm² at 77 K (0 T). The best film has Jc = 1.3×10⁴ A/cm² (Ic = 3.6 A) at 77 K (0 T). Films prepared on 26 mm × 9 mm substrates show typical large-area Jc values > 0.5×10⁴ A/cm² (77 K, 0 T). A square planar specimen of dimensions 4.3 mm × 4.3 mm exhibited magnetisation Jc values of 1.2×10⁵ A/cm² at 4.2 K (0.1 T), 9.3×10⁴ A/cm² at 10 K (0.1 T), 3.3×10⁴ A/cm² at 4 K (8 T), and 1.6×10⁴ A/cm² at 10 K (8 T). © 1994.

Relevance:

20.00%

Publisher:

Abstract:

Statistical model-based methods are presented for the reconstruction of autocorrelated signals in impulsive plus continuous noise environments. Signals are modelled as autoregressive and noise sources as discrete and continuous mixtures of Gaussians, allowing for robustness in highly impulsive and non-Gaussian environments. Markov Chain Monte Carlo methods are used for reconstruction of the corrupted waveforms within a Bayesian probabilistic framework and results are presented for contaminated voice and audio signals.

Relevance:

20.00%

Publisher:

Abstract:

In this paper methods are developed for enhancement and analysis of autoregressive moving average (ARMA) signals observed in additive noise which can be represented as mixtures of heavy-tailed non-Gaussian sources and a Gaussian background component. Such models find application in systems such as atmospheric communications channels or early sound recordings which are prone to intermittent impulse noise. Markov Chain Monte Carlo (MCMC) simulation techniques are applied to the joint problem of signal extraction, model parameter estimation and detection of impulses within a fully Bayesian framework. The algorithms require only simple linear iterations for all of the unknowns, including the MA parameters, which is in contrast with existing MCMC methods for analysis of noise-free ARMA models. The methods are illustrated using synthetic data and noise-degraded sound recordings.

Relevance:

20.00%

Publisher:

Abstract:

A study has been performed of the erosion of aluminium by silica sand particles at a velocity of 4.5 m s⁻¹, both air-borne and in the form of a water-borne slurry. Measurements made under similar experimental conditions show that slurry erosion proceeds at a rate several times that of air-borne erosion, the ratio of the two rates depending strongly on the angle of impact. Sand particles become embedded into the metal surface during air-borne particle erosion, forming a composite layer of metal and silica, and provide the major cause of the difference in wear rate. The embedded particles give rise to surface hardening and a significant reduction in the erosion rate. Embedment of erodent particles was not observed during slurry erosion. Lubrication of the impacting interfaces by water appears to have minimal effect on the wear of aluminium by slurry erosion.

Relevance:

20.00%

Publisher:

Abstract:

This paper describes the development of the CU-HTK Mandarin Speech-To-Text (STT) system and assesses its performance as part of a transcription-translation pipeline which converts broadcast Mandarin audio into English text. Recent improvements to the STT system are described and these give Character Error Rate (CER) gains of 14.3% absolute for a Broadcast Conversation (BC) task and 5.1% absolute for a Broadcast News (BN) task. The output of these STT systems is then post-processed, so that it consists of sentence-like segments, and translated into English text using a Statistical Machine Translation (SMT) system. The performance of the transcription-translation pipeline is evaluated using the Translation Edit Rate (TER) and BLEU metrics. It is shown that improving both the STT system and the post-STT segmentations can lower the TER scores by up to 5.3% absolute and increase the BLEU scores by up to 2.7% absolute. © 2007 IEEE.
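For readers unfamiliar with the metric, the character error rate is conventionally the character-level edit distance between reference and hypothesis divided by the reference length. The sketch below assumes that conventional definition; the abstract does not describe the exact scoring tool or text normalization used, and the function names are illustrative.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two character sequences."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = (substitutions + deletions + insertions) / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of four.
print(character_error_rate("abcd", "abxd"))
```

An "absolute" gain, as quoted in the abstract, is the difference between two such rates in percentage points rather than a relative reduction.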

Relevance:

20.00%

Publisher:

Abstract:

This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation-side-based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion-network-based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class-based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system. © 2005 IEEE.

Relevance:

20.00%

Publisher:

Abstract:

Reconfigurable shutter-based free-space optical switching technologies using fiber ribbon and multiple wavelengths per fiber for Storage Area Network (SAN) applications are presented and demonstrated. © 2009 SPIE-OSA-IEEE.

Relevance:

20.00%

Publisher:

Abstract:

Reconfigurable shutter-based free-space optical switching technologies using fiber ribbon and multiple wavelengths per fiber for Storage Area Network (SAN) applications are presented and demonstrated. © 2009 Optical Society of America.