Digital Library

21 results for Letters in word recognition

in Cambridge University Engineering Department Publications Database

Modified Kanerva model. Results for real time word recognition

Relevance: 100.00%

Abstract:

This paper describes results obtained using the modified Kanerva model to perform word recognition in continuous speech after being trained on the multi-speaker Alvey 'Hotel' speech corpus. Theoretical discoveries have recently enabled us to increase the speed of execution of part of the model by two orders of magnitude over that previously reported by Prager & Fallside. The memory required for the operation of the model has been similarly reduced. The recognition accuracy reaches 95% without syntactic constraints when tested on different data from seven trained speakers. Real time simulation of a model with 9,734 active units is now possible in both training and recognition modes using the Alvey PARSIFAL transputer array. The modified Kanerva model is a static network consisting of a fixed nonlinear mapping (location matching) followed by a single layer of conventional adaptive links. A section of preprocessed speech is transformed by the non-linear mapping to a high dimensional representation. From this intermediate representation a simple linear mapping is able to perform complex pattern discrimination to form the output, indicating the nature of the speech features present in the input window.
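The two-stage structure described in this abstract (a fixed nonlinear location-matching stage followed by one layer of adaptive links) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The binary inputs, the Hamming-distance matching rule, the normalized delta-rule update, and all names (`make_locations`, `location_match`, `train_linear_layer`, `predict`) are assumptions chosen for illustration.

```python
import random


def make_locations(num_locations, dim, seed=0):
    """Fixed random binary addresses (assumed form of the location store)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(dim)] for _ in range(num_locations)]


def location_match(x, locations, radius):
    """Fixed nonlinear mapping: a unit is active (1) when its address lies
    within the given Hamming radius of the input x, otherwise 0."""
    return [
        1 if sum(a != b for a, b in zip(x, loc)) <= radius else 0
        for loc in locations
    ]


def train_linear_layer(patterns, targets, locations, radius, num_outputs,
                       epochs=50, lr=0.5):
    """Train the single adaptive layer with a normalized delta rule
    (an assumed training rule; the abstract does not specify one)."""
    weights = [[0.0] * len(locations) for _ in range(num_outputs)]
    for _ in range(epochs):
        for x, t in zip(patterns, targets):
            h = location_match(x, locations, radius)
            active = sum(h) or 1  # avoid division by zero if nothing matches
            for k in range(num_outputs):
                y = sum(w * hj for w, hj in zip(weights[k], h))
                err = t[k] - y
                for j, hj in enumerate(h):
                    if hj:
                        weights[k][j] += lr * err / active
    return weights


def predict(x, weights, locations, radius):
    """Linear read-out from the high-dimensional intermediate representation."""
    h = location_match(x, locations, radius)
    return [sum(w * hj for w, hj in zip(row, h)) for row in weights]
```

Only the second stage is trained; the addresses returned by `make_locations` stay fixed, which mirrors the "fixed nonlinear mapping" in the abstract. A call such as `predict(x, weights, locations, radius)` returns one score per output unit.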

On the Benefits of Confidence Visualization in Speech Recognition

Relevance: 100.00%

Task dependent loss functions in speech recognition: A-star search over recognition lattices

Relevance: 100.00%

Task dependent loss functions in speech recognition: application to named entity extraction

Relevance: 100.00%

Neurocontrol in sequence recognition

Relevance: 100.00%

The application of hidden Markov models in speech recognition

Relevance: 100.00%

Synthesis by rule of prosodic features in word concatenation synthesis

Relevance: 100.00%

The modified Kanerva model - theory and results for real-time word recognition

Relevance: 100.00%

Finite-state models, event logics and statistics in speech recognition - Discussion

Relevance: 100.00%

Experiments in word-reordering and morphological preprocessing for transducer-based statistical machine translation

Relevance: 100.00%

Speech recognition in VODIS II

Relevance: 100.00%

Abstract:

VODIS II, a research system in which recognition is based on the conventional one-pass connected-word algorithm extended in two ways, is described. Syntactic constraints can now be applied directly via context-free-grammar rules, and the algorithm generates a lattice of candidate word matches rather than a single globally optimal sequence. This lattice is then processed by a chart parser and an intelligent dialogue controller to obtain the most plausible interpretations of the input. A key feature of the VODIS II architecture is that the concept of an abstract word model allows the system to be used with different pattern-matching technologies and hardware. The current system implements the word models on a real-time dynamic-time-warping recognizer.

The use of syntax and multiple alternatives in the VODIS voice operated database inquiry system

Relevance: 100.00%

Abstract:

The paper describes the architecture of VODIS, a voice operated database inquiry system, and presents some experiments which investigate the effects on performance of varying the level of a priori syntactic constraints. The VODIS system includes a novel mechanism for incorporating context-free grammatical constraints directly into the word recognition algorithm. This allows the degree of a priori constraint to be smoothly varied and provides for the controlled generation of multiple alternatives. The results show that when the spoken input deviates from the predefined task grammar, a combination of weak a priori syntax rules in conjunction with full a posteriori parsing on a lattice of alternative word matches provides the most robust recognition performance. © 1991.

Language model combination and adaptation using weighted finite state transducers

Relevance: 100.00%

Abstract:

In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. They can be used either to represent different genres or tasks found in diverse text sources, or to capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimum change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences. ©2010 IEEE.
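As a rough illustration of what "combining multiple n-gram models" can mean, the sketch below linearly interpolates component models stored as plain dictionaries. This is an assumption made for illustration: the abstract does not say which combination scheme is used, and the paper's WFST-based method is not reproduced here. The function name `interpolate` and the toy models in the test data are hypothetical.

```python
def interpolate(models, weights, history, word):
    """Linearly interpolated probability:
        P(word | history) = sum_i weights[i] * P_i(word | history)

    Each model is a dict mapping a history tuple to a dict of
    word -> probability (a hypothetical storage format).
    """
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    return sum(
        w * model.get(history, {}).get(word, 0.0)
        for model, w in zip(models, weights)
    )
```

Because the weights sum to one, the result is again a probability distribution over words for any history on which every component model is itself normalized.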

Minimum Bayes-risk automatic speech recognition

Relevance: 100.00%

Applications of stochastic context-free grammars using the Inside-Outside algorithm

Relevance: 100.00%

Abstract:

This paper describes two applications in speech recognition of the use of stochastic context-free grammars (SCFGs) trained automatically via the Inside-Outside Algorithm. First, SCFGs are used to model VQ encoded speech for isolated word recognition and are compared directly to HMMs used for the same task. It is shown that SCFGs can model this low-level VQ data accurately and that a regular grammar based pre-training algorithm is effective both for reducing training time and obtaining robust solutions. Second, an SCFG is inferred from a transcription of the speech used to train a phoneme-based recognizer in an attempt to model phonotactic constraints. When used as a language model, this SCFG gives improved performance over a comparable regular grammar or bigram. © 1991.
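The Inside-Outside Algorithm mentioned in this abstract relies on inside probabilities, the total probability that a nonterminal derives a given span of the input. A minimal sketch of the inside pass alone follows, assuming a grammar in Chomsky normal form stored in plain dictionaries. The function name `inside_probability`, the rule encoding, and the toy grammar in the test data are hypothetical and not taken from the paper; the outside pass and the re-estimation step are omitted.

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, start="S"):
    """Probability that `start` derives `words` under an SCFG in Chomsky normal form.

    lexical: dict mapping (A, word) -> P(A -> word)
    binary:  dict mapping (A, B, C) -> P(A -> B C)
    """
    n = len(words)
    if n == 0:
        return 0.0
    # beta[i][j][A] = probability that A derives words[i..j] (inclusive)
    beta = [[defaultdict(float) for _ in range(n)] for _ in range(n)]

    # Spans of length 1 come from the lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[i][i][a] += p

    # Longer spans sum over every split point and every binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for (a, b, c), p in binary.items():
                    left = beta[i][k].get(b, 0.0)
                    right = beta[k + 1][j].get(c, 0.0)
                    if left and right:
                        beta[i][j][a] += p * left * right

    return beta[0][n - 1].get(start, 0.0)
```

The table is filled bottom-up by span length, so the cost grows with the cube of the sentence length times the number of binary rules.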
