961 results for Automatic speech recognition (ASR)


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work is part of an ongoing collaborative project between the medical and signal processing communities to promote new research efforts on the automatic diagnosis of obstructive sleep apnea (OSA). In this paper, we explore the differences noted in phonetic classes (interphoneme) across groups (control/apnoea) and analyze their utility for OSA detection.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper introduces the session on advanced speech recognition technology. The two papers comprising this session argue that current technology yields a performance that is only an order of magnitude in error rate away from human performance and that incremental improvements will bring us to that desired level. I argue that, to the contrary, present performance is far removed from human performance and a revolution in our thinking is required to achieve the goal. It is further asserted that to bring about the revolution more effort should be expended on basic research and less on trying to prematurely commercialize a deficient technology.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Speech recognition involves three processes: extraction of acoustic indices from the speech signal, estimation of the probability that the observed index string was caused by a hypothesized utterance segment, and determination of the recognized utterance via a search among hypothesized alternatives. This paper is not concerned with the first process. Estimation of the probability of an index string involves a model of index production by any given utterance segment (e.g., a word). Hidden Markov models (HMMs) are used for this purpose [Makhoul, J. & Schwartz, R. (1995) Proc. Natl. Acad. Sci. USA 92, 9956-9963]. Their parameters are state transition probabilities and output probability distributions associated with the transitions. The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string. That probability is the product of two factors: the probability that the utterance will produce the string and the probability that the speaker will wish to produce the utterance (the language model probability). Even if the vocabulary size is moderate, it is impossible to search for the utterance exhaustively. One practical algorithm is described [Viterbi, A. J. (1967) IEEE Trans. Inf. Theory IT-13, 260-267] that, given the index string, has a high likelihood of finding the most probable utterance.
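The search step described in this abstract can be illustrated with a minimal Viterbi sketch. This is not the paper's recognizer: it assumes a small discrete HMM whose outputs are attached to states (the abstract attaches them to transitions), it recovers the most probable state sequence rather than the most probable utterance, and the state names, symbols, and probabilities are all hypothetical values chosen for illustration.

```python
import math


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Hypothetical two-state model over three output symbols (illustrative values only)
states = ["A", "B"]
log_start = to_log({"A": 0.6, "B": 0.4})
log_trans = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
log_emit = to_log({
    "A": {"x": 0.5, "y": 0.4, "z": 0.1},
    "B": {"x": 0.1, "y": 0.3, "z": 0.6},
})

log_prob, path = viterbi(["x", "y", "z"], states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

Working in log-probabilities avoids numerical underflow on long observation sequences. In a full recognizer the language model probability would be added to the acoustic score in the same log domain, and large search spaces are typically pruned, which is why the search is only likely, not guaranteed, to find the best utterance.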

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loudspeakers to give easy accessibility to increasingly intelligent machines.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The role of polarisation in late time complex resonance based target identification is investigated numerically for the case of an L-shaped wire. While repeated extraction of the resonances for varying polarisation allows for better signal-to-noise immunity, it is also found that there are preferred polarisations for each complex resonance. The first few of these polarisations are extracted for the sample target.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The research presented in this paper is part of an ongoing investigation into how best to incorporate speech-based input within mobile data collection applications. In our previous work [1], we evaluated the ability of a single speech recognition engine to support accurate, mobile, speech-based data input. Here, we build on our previous research to compare the achievable speaker-independent accuracy rates of a variety of speech recognition engines; we also consider the relative effectiveness of different speech recognition engine and microphone pairings in terms of their ability to support accurate text entry under realistic mobile conditions of use. Our intent is to provide some initial empirical data derived from mobile, user-based evaluations to support technological decisions faced by developers of mobile applications that would benefit from, or require, speech-based data entry facilities.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Speech recognition technology is regarded as a key enabler for increasing the usability of applications deployed on mobile devices -- devices which are becoming increasingly prevalent in modern hospital-based healthcare. Although the use of speech recognition is not new to the hospital-based healthcare domain, its use with mobile devices has thus far been limited. This paper presents the results of a literature review we conducted in order to observe the manner in which speech recognition technology has been used in hospital-based healthcare and to gain an understanding of how this technology is being evaluated, in terms of its dependability and reliability, in healthcare settings. Our intent is that this review will help identify scope for future uses of speech recognition technologies in the healthcare domain, as well as to identify implications for the meaningful evaluation of such technologies given the specific context of use.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we propose a speech recognition engine using a hybrid of a Hidden Markov Model (HMM) and a Gaussian Mixture Model (GMM). The two models are trained independently, and their respective likelihood values are fed jointly into a decision logic that outputs a net likelihood. This hybrid model is compared with the HMM alone. Training and testing were done using a database of 20 Hindi words spoken by 80 different speakers. The recognition rate achieved by the standard HMM is 83.5%, which increases to 85% with the hybrid HMM-GMM approach.
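The abstract does not say how the decision logic combines the two likelihoods. One possible reading, offered purely as an assumption, is a weighted sum of per-word log-likelihoods followed by an argmax. In the sketch below the function names, the `weight` parameter, the word labels, and the scores are all hypothetical and are not taken from the paper.

```python
def combine_scores(hmm_loglik, gmm_loglik, weight=0.5):
    """Weighted sum of per-word log-likelihoods (assumed decision rule)."""
    return {
        word: weight * hmm_loglik[word] + (1.0 - weight) * gmm_loglik[word]
        for word in hmm_loglik
    }


def recognize(hmm_loglik, gmm_loglik, weight=0.5):
    """Return the word with the highest combined (net) log-likelihood."""
    net = combine_scores(hmm_loglik, gmm_loglik, weight)
    return max(net, key=net.get)


# Made-up log-likelihoods for three placeholder vocabulary words
hmm_scores = {"word1": -120.0, "word2": -118.5, "word3": -130.0}
gmm_scores = {"word1": -95.0, "word2": -101.0, "word3": -99.0}

hmm_only = max(hmm_scores, key=hmm_scores.get)
hybrid = recognize(hmm_scores, gmm_scores)
print(hmm_only, hybrid)
```

With these illustrative numbers the HMM alone and the combined rule choose different words, which is the kind of case where a joint decision could change the recognition result. In practice the weight would be tuned on held-out data.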

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.
