22 results for Audio-Visual Automatic Speech Recognition

in Aston University Research Archive


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This thesis addresses the viability of automatic speech recognition (ASR) for control room systems; with careful system design, ASR devices can be a useful means of human-computer interaction in specific types of task. These tasks can be defined as complex verbal activities, such as command and control, and can be paired with spatial tasks, such as monitoring, without detriment. It is suggested that ASR use be confined to routine plant operation, as opposed to critical incidents, because stress may affect the operators' speech. It is proposed that using ASR will require operators to adapt a commonly used skill to cater for a novel use of speech. Before using the ASR device, new operators will require some form of training. It is shown that a demonstration by an experienced user of the device can lead to better performance than instructions. Thus, a relatively cheap and very efficient form of operator training can be supplied through demonstration by experienced ASR operators. From a series of studies into speech-based interaction with computers, it is concluded that the interaction be designed to capitalise upon the tendency of operators to use short, succinct, task-specific styles of speech. From studies comparing different types of feedback, it is concluded that operators be given screen-based feedback, rather than auditory feedback, for control room operation. Feedback will take two forms: the use of the ASR device will require recognition feedback, which will be best supplied using text; the performance of a process control task will require task feedback integrated into the mimic display. This latter feedback can be either textual or symbolic, but it is suggested that symbolic feedback will be more beneficial. Related to both interaction style and feedback is the issue of handling recognition errors. These should be corrected by simple command repetition practices, rather than by error-handling dialogues. This method of error correction is held to be non-intrusive to primary command and control operations. This thesis also addresses some of the problems of user error in ASR use, and provides a number of recommendations for its reduction.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The research presented in this paper is part of an ongoing investigation into how best to incorporate speech-based input within mobile data collection applications. In our previous work [1], we evaluated the ability of a single speech recognition engine to support accurate, mobile, speech-based data input. Here, we build on our previous research to compare the achievable speaker-independent accuracy rates of a variety of speech recognition engines; we also consider the relative effectiveness of different speech recognition engine and microphone pairings in terms of their ability to support accurate text entry under realistic mobile conditions of use. Our intent is to provide some initial empirical data derived from mobile, user-based evaluations to support technological decisions faced by developers of mobile applications that would benefit from, or require, speech-based data entry facilities.
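
As an illustration of how engine accuracy rates of this kind could be compared, the following is a minimal sketch that scores a recognised transcript against a reference using word-level edit distance. The abstract does not say which metric the authors used, so the metric, the function names `word_errors` and `word_accuracy`, and the sample phrases are all assumptions made for this example.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance (substitutions + insertions + deletions)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 minus the word error rate; can be negative with many insertions."""
    n_words = len(reference.split())
    if n_words == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / n_words


# Hypothetical reference phrase and outputs from two hypothetical engines.
reference = "record water temperature twelve degrees"
engine_outputs = {
    "engine_a": "record water temperature twelve degrees",
    "engine_b": "record what temperature twelve",
}
for name, text in engine_outputs.items():
    print(name, round(word_accuracy(reference, text), 2))
```

Averaging such a score over many utterances and speakers would give one accuracy figure per engine and microphone pairing, which is the kind of comparison the abstract describes.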

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Speech recognition technology is regarded as a key enabler for increasing the usability of applications deployed on mobile devices -- devices which are becoming increasingly prevalent in modern hospital-based healthcare. Although the use of speech recognition is not new to the hospital-based healthcare domain, its use with mobile devices has thus far been limited. This paper presents the results of a literature review we conducted in order to observe the manner in which speech recognition technology has been used in hospital-based healthcare and to gain an understanding of how this technology is being evaluated, in terms of its dependability and reliability, in healthcare settings. Our intent is that this review will help identify scope for future uses of speech recognition technologies in the healthcare domain, as well as to identify implications for the meaningful evaluation of such technologies given the specific context of use.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we discuss how an innovative audio-visual project was adopted to foster active, rather than declarative learning, in critical International Relations (IR). First, we explore the aesthetic turn in IR, to contrast this with forms of representation that have dominated IR scholarship. Second, we describe how students were asked to record short audio or video projects to explore their own insights through aesthetic and non-written formats. Third, we explain how these projects are understood to be deeply embedded in social science methodologies. We cite our inspiration from applying a personal sociological imagination, as a way to counterbalance a ‘marketised’ slant in higher education, in a global economy where students are often encouraged to consume, rather than produce knowledge. Finally, we draw conclusions in terms of deeper forms of student engagement leading to new ways of thinking and presenting new skills and new connections between theory and practice.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

OBJECTIVE: Cochlear implantation (CI) is a standard treatment for severe-profound sensorineural hearing loss (SNHL). However, consensus has yet to be reached on its effectiveness for hearing loss caused by auditory neuropathy spectrum disorder (ANSD). This review aims to summarize and synthesize current evidence of the effectiveness of CI in improving speech recognition in children with ANSD. DESIGN: Systematic review. STUDY SAMPLE: A total of 27 studies from an initial selection of 237. RESULTS: All selected studies were observational in design, including case studies, cohort studies, and comparisons between children with ANSD and SNHL. Most children with ANSD achieved open-set speech recognition with their CI. Speech recognition ability was found to be equivalent in CI users (who previously performed poorly with hearing aids) and hearing-aid users. Outcomes following CI generally appeared similar in children with ANSD and SNHL. Assessment of study quality, however, suggested substantial methodological concerns, particularly in relation to issues of bias and confounding, limiting the robustness of any conclusions around effectiveness. CONCLUSIONS: Currently available evidence is compatible with favourable outcomes from CI in children with ANSD. However, this evidence is weak. Stronger evidence is needed to support cost-effective clinical policy and practice in this area.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Automatic Term Recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. Of the large number of methodologies available in the literature, only a few are able to handle both single and multi-word terms. In this paper we present a comparison of five such algorithms and propose a combined approach using a voting mechanism. We evaluated the six approaches using two different corpora and show how the voting algorithm performs best on one corpus (a collection of texts from Wikipedia) and less well using the Genia corpus (a standard life science corpus). This indicates that the choice and design of corpus have a major impact on the evaluation of term recognition algorithms. Our experiments also showed that single-word terms can be equally important and occupy a fairly large proportion in certain domains. As a result, algorithms that ignore single-word terms may cause problems for tasks built on top of ATR. Effective ATR systems also need to take into account both the unstructured text and the structured aspects, which means information extraction techniques need to be integrated into the term recognition process.
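
The abstract does not specify how the voting mechanism works. As a rough sketch of one possible scheme, the code below lets each term extractor vote for its top-ranked candidates and keeps the terms that collect enough votes. The function name `vote_terms`, the parameters `top_n` and `min_votes`, and the sample rankings are hypothetical and are not taken from the paper.

```python
from collections import Counter


def vote_terms(ranked_lists, top_n=3, min_votes=2):
    """Keep terms found in the top_n of at least min_votes extractors."""
    votes = Counter()
    for ranking in ranked_lists:
        # Each extractor casts at most one vote per candidate term.
        for term in set(ranking[:top_n]):
            votes[term] += 1
    winners = [(term, n) for term, n in votes.items() if n >= min_votes]
    # Most votes first; ties broken alphabetically for a stable order.
    return sorted(winners, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical ranked outputs of three term recognition algorithms.
rankings = [
    ["gene expression", "protein", "cell line", "study"],
    ["protein", "gene expression", "result", "cell line"],
    ["cell", "protein", "gene expression", "binding site"],
]
print(vote_terms(rankings))
# → [('gene expression', 3), ('protein', 3)]
```

Note that the sketch treats single-word terms ("protein") and multi-word terms ("gene expression") alike, in line with the abstract's point that both kinds matter.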

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper discusses the first of three studies which collectively represent a convergence of two ongoing research agendas: (1) the empirically-based comparison of the effects of evaluation environment on mobile usability evaluation results; and (2) the effect of environment - in this case lobster fishing boats - on achievable speech-recognition accuracy. We describe, in detail, our study and outline our results to date based on preliminary analysis. Broadly speaking, the potential for effective use of speech for data collection and vessel control looks very promising - surprisingly so! We outline our ongoing analysis and further work.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper reviews some basic issues and methods involved in using neural networks to respond in a desired fashion to a temporally-varying environment. Some popular network models and training methods are introduced. A speech recognition example is then used to illustrate the central difficulty of temporal data processing: learning to notice and remember relevant contextual information. Feedforward network methods are applicable to cases where this problem is not severe. The application of these methods is explained and applications are discussed in the areas of pure mathematics, chemical and physical systems, and economic systems. A more powerful but less practical algorithm for temporal problems, the moving targets algorithm, is sketched and discussed. For completeness, a few remarks are made on reinforcement learning.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Both attentional difficulties and rapid processing deficits have recently been linked with dyslexia. We report two studies comparing the performance of dyslexic and control teenagers on attentional tasks. The two studies were based on two different conceptions of attention. Study 1 employed a design that allowed three key components of attention - focusing, switching, and sustaining - to be investigated separately. One hypothesis under investigation was that rapid processing problems - in particular impaired ability to switch attention rapidly - might be associated with dyslexia. However, although dyslexic participants were significantly less accurate than their controls in a condition where they had to switch attention between two target types, the nature of the deficit suggested that the problem was not in switching attention per se. Thus, in Study 2, we explored an alternative interpretation of the Study 1 results in terms of the classic capacity-limited models of "central" attention. We contrasted two hypotheses: (1) that dyslexic teenagers have reduced cognitive resources versus (2) that they suffer from a general impairment in the ability to automatise basic skills. To investigate the automaticity of the shape recognition component of the task a similar attention paradigm to that used in Study 1 was employed, but using degraded, as well as intact, stimuli. It was found that stimulus degradation led to relatively less impairment for dyslexic than for matched control groups. The results support the hypothesis that dyslexic people suffer from a general impairment in the ability to automatise skills - in this case the skill of automatic shape recognition.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This chapter provides the theoretical foundation and background on the data envelopment analysis (DEA) method. We first introduce the basic DEA models. The balance of this chapter focuses on evidence that DEA has been extensively applied for measuring efficiency and productivity of services including financial services (banking, insurance, securities, and fund management), professional services, health services, education services, environmental and public services, energy services, logistics, tourism, information technology, telecommunications, transport, distribution, audio-visual, media, entertainment, cultural and other business services. Finally, we provide information on the use of Performance Improvement Management Software (PIM-DEA). A free limited version of this software and the downloading procedure are also included in this chapter.
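
To give a feel for what a basic DEA efficiency score measures, here is a minimal sketch for the simplest possible case: one input, one output, and constant returns to scale. In that special case the score of each unit reduces to its output/input ratio divided by the best ratio observed. The function name `single_ratio_efficiency` and the branch data are hypothetical; the chapter's general models handle multiple inputs and outputs through linear programming, which this sketch does not attempt.

```python
def single_ratio_efficiency(inputs, outputs):
    """Relative efficiency per unit for one input and one output.

    Assumes constant returns to scale and strictly positive inputs,
    with at least one positive output.
    """
    if not inputs or len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must be non-empty and equal in length")
    if any(x <= 0 for x in inputs):
        raise ValueError("inputs must be strictly positive")
    # Productivity of each unit: output produced per unit of input.
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    if best <= 0:
        raise ValueError("at least one output must be positive")
    # The most productive unit defines the frontier and scores 1.0.
    return [r / best for r in ratios]


# Hypothetical service branches: staff employed and transactions handled.
staff = [10, 20, 40]
transactions = [100, 100, 300]
print(single_ratio_efficiency(staff, transactions))
# → [1.0, 0.5, 0.75]
```

A score of 0.5 means the second branch would need only half its staff to handle its workload if it matched the best observed productivity.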