52 resultados para automatic speech recognition systems

em Aston University Research Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis addresses the viability of automatic speech recognition for control room systems; with careful system design, automatic speech recognition (ASR) devices can be useful means for human computer interaction in specific types of task. These tasks can be defined as complex verbal activities, such as command and control, and can be paired with spatial tasks, such as monitoring, without detriment. It is suggested that ASR use be confined to routine plant operation, as opposed the critical incidents, due to possible problems of stress on the operators' speech.  It is proposed that using ASR will require operators to adapt a commonly used skill to cater for a novel use of speech. Before using the ASR device, new operators will require some form of training. It is shown that a demonstration by an experienced user of the device can lead to superior performance than instructions. Thus, a relatively cheap and very efficient form of operator training can be supplied by demonstration by experienced ASR operators. From a series of studies into speech based interaction with computers, it is concluded that the interaction be designed to capitalise upon the tendency of operators to use short, succinct, task specific styles of speech. From studies comparing different types of feedback, it is concluded that operators be given screen based feedback, rather than auditory feedback, for control room operation. Feedback will take two forms: the use of the ASR device will require recognition feedback, which will be best supplied using text; the performance of a process control task will require task feedback integrated into the mimic display. This latter feedback can be either textual or symbolic, but it is suggested that symbolic feedback will be more beneficial. Related to both interaction style and feedback is the issue of handling recognition errors. These should be corrected by simple command repetition practices, rather than use error handling dialogues. This method of error correction is held to be non intrusive to primary command and control operations. This thesis also addresses some of the problems of user error in ASR use, and provides a number of recommendations for its reduction.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many attempts have been made to overcome problems involved in character recognition which have resulted in the manufacture of character reading machines. An investigation into a new approach to character recognition is described. Features for recognition are Fourier coefficients. These are generated optically by convolving characters with periodic gratings. The development of hardware to enable automatic measurement of contrast and position of periodic shadows produced by the convolution is described. Fourier coefficients of character sets were measured, many of which are tabulated. Their analysis revealed that a few low frequency sampling points could be selected to recognise sets of numerals. Limited treatment is given to show the effect of type face variations on the values of coefficients which culminated in the location of six sampling frequencies used as features to recognise numerals in two type fonts. Finally, the construction of two character recognition machines is compared and contrasted. The first is a pilot plant based on a test bed optical Fourier analyser, while the second is a more streamlined machine d(3signed for high speed reading. Reasons to indicate that the latter machine would be the most suitable to adapt for industrial and commercial applications are discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The research presented in this paper is part of an ongoing investigation into how best to incorporate speech-based input within mobile data collection applications. In our previous work [1], we evaluated the ability of a single speech recognition engine to support accurate, mobile, speech-based data input. Here, we build on our previous research to compare the achievable speaker-independent accuracy rates of a variety of speech recognition engines; we also consider the relative effectiveness of different speech recognition engine and microphone pairings in terms of their ability to support accurate text entry under realistic mobile conditions of use. Our intent is to provide some initial empirical data derived from mobile, user-based evaluations to support technological decisions faced by developers of mobile applications that would benefit from, or require, speech-based data entry facilities.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The research presented in this paper is part of an ongoing investigation into how best to incorporate speech-based input within mobile data collection applications. In our previous work [1], we evaluated the ability of a single speech recognition engine to support accurate, mobile, speech-based data input. Here, we build on our previous research to compare the achievable speaker-independent accuracy rates of a variety of speech recognition engines; we also consider the relative effectiveness of different speech recognition engine and microphone pairings in terms of their ability to support accurate text entry under realistic mobile conditions of use. Our intent is to provide some initial empirical data derived from mobile, user-based evaluations to support technological decisions faced by developers of mobile applications that would benefit from, or require, speech-based data entry facilities.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Speech recognition technology is regarded as a key enabler for increasing the usability of applications deployed on mobile devices -- devices which are becoming increasingly prevalent in modern hospital-based healthcare. Although the use of speech recognition is not new to the hospital-based healthcare domain, its use with mobile devices has thus far been limited. This paper presents the results of a literature review we conducted in order to observe the manner in which speech recognition technology has been used in hospital-based healthcare and to gain an understanding of how this technology is being evaluated, in terms of its dependability and reliability, in healthcare settings. Our intent is that this review will help identify scope for future uses of speech recognition technologies in the healthcare domain, as well as to identify implications for the meaningful evaluation of such technologies given the specific context of use.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Speech recognition technology is regarded as a key enabler for increasing the usability of applications deployed on mobile devices -- devices which are becoming increasingly prevalent in modern hospital-based healthcare. Although the use of speech recognition is not new to the hospital-based healthcare domain, its use with mobile devices has thus far been limited. This paper presents the results of a literature review we conducted in order to observe the manner in which speech recognition technology has been used in hospital-based healthcare and to gain an understanding of how this technology is being evaluated, in terms of its dependability and reliability, in healthcare settings. Our intent is that this review will help identify scope for future uses of speech recognition technologies in the healthcare domain, as well as to identify implications for the meaningful evaluation of such technologies given the specific context of use.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

OBJECTIVE: Cochlear implantation (CI) is a standard treatment for severe-profound sensorineural hearing loss (SNHL). However, consensus has yet to be reached on its effectiveness for hearing loss caused by auditory neuropathy spectrum disorder (ANSD). This review aims to summarize and synthesize current evidence of the effectiveness of CI in improving speech recognition in children with ANSD. DESIGN: Systematic review. STUDY SAMPLE: A total of 27 studies from an initial selection of 237. RESULTS: All selected studies were observational in design, including case studies, cohort studies, and comparisons between children with ANSD and SNHL. Most children with ANSD achieved open-set speech recognition with their CI. Speech recognition ability was found to be equivalent in CI users (who previously performed poorly with hearing aids) and hearing-aid users. Outcomes following CI generally appeared similar in children with ANSD and SNHL. Assessment of study quality, however, suggested substantial methodological concerns, particularly in relation to issues of bias and confounding, limiting the robustness of any conclusions around effectiveness. CONCLUSIONS: Currently available evidence is compatible with favourable outcomes from CI in children with ANSD. However, this evidence is weak. Stronger evidence is needed to support cost-effective clinical policy and practice in this area.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Automatic Term Recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. From a large number of methodologies available in the literature only a few are able to handle both single and multi-word terms. In this paper we present a comparison of five such algorithms and propose a combined approach using a voting mechanism. We evaluated the six approaches using two different corpora and show how the voting algorithm performs best on one corpus (a collection of texts from Wikipedia) and less well using the Genia corpus (a standard life science corpus). This indicates that choice and design of corpus has a major impact on the evaluation of term recognition algorithms. Our experiments also showed that single-word terms can be equally important and occupy a fairly large proportion in certain domains. As a result, algorithms that ignore single-word terms may cause problems to tasks built on top of ATR. Effective ATR systems also need to take into account both the unstructured text and the structured aspects and this means information extraction techniques need to be integrated into the term recognition process.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper reviews some basic issues and methods involved in using neural networks to respond in a desired fashion to a temporally-varying environment. Some popular network models and training methods are introduced. A speech recognition example is then used to illustrate the central difficulty of temporal data processing: learning to notice and remember relevant contextual information. Feedforward network methods are applicable to cases where this problem is not severe. The application of these methods are explained and applications are discussed in the areas of pure mathematics, chemical and physical systems, and economic systems. A more powerful but less practical algorithm for temporal problems, the moving targets algorithm, is sketched and discussed. For completeness, a few remarks are made on reinforcement learning.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Conventionally, biometrics resources, such as face, gait silhouette, footprint, and pressure, have been utilized in gender recognition systems. However, the acquisition and processing time of these biometrics data makes the analysis difficult. This letter demonstrates for the first time how effective the footwear appearance is for gender recognition as a biometrics resource. A footwear database is also established with reprehensive shoes (footwears). Preliminary experimental results suggest that footwear appearance is a promising resource for gender recognition. Moreover, it also has the potential to be used jointly with other developed biometrics resources to boost performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper discusses the first of three studies which collectively represent a convergence of two ongoing research agendas: (1) the empirically-based comparison of the effects of evaluation environment on mobile usability evaluation results; and (2) the effect of environment - in this case lobster fishing boats - on achievable speech-recognition accuracy. We describe, in detail, our study and outline our results to date based on preliminary analysis. Broadly speaking, the potential for effective use of speech for data collection and vessel control looks very promising - surprisingly so! We outline our ongoing analysis and further work.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper discusses the first of three studies which collectively represent a convergence of two ongoing research agendas: (1) the empirically-based comparison of the effects of evaluation environment on mobile usability evaluation results; and (2) the effect of environment - in this case lobster fishing boats - on achievable speech-recognition accuracy. We describe, in detail, our study and outline our results to date based on preliminary analysis. Broadly speaking, the potential for effective use of speech for data collection and vessel control looks very promising - surprisingly so! We outline our ongoing analysis and further work.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

N-tuple recognition systems (RAMnets) are normally modeled using a small number of input lines to each RAM, because the address space grows exponentially with the number of inputs. It is impossible to implement an arbitrarily-large address space as physical memory. But given modest amounts of training data, correspondingly modest numbers of bits will be set in that memory. Hash arrays can therefore be used instead of a direct implementation of the required address space. This paper describes some exploratory experiments using the hash array technique to investigate the performance of RAMnets with very large numbers of input lines. An argument is presented which concludes that performance should peak at a relatively small n-tuple size, but the experiments carried out so far contradict this. Further experiments are needed to confirm this unexpected result.