924 resultados para acoustic speech recognition system


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis addresses the viability of automatic speech recognition for control room systems; with careful system design, automatic speech recognition (ASR) devices can be useful means for human computer interaction in specific types of task. These tasks can be defined as complex verbal activities, such as command and control, and can be paired with spatial tasks, such as monitoring, without detriment. It is suggested that ASR use be confined to routine plant operation, as opposed the critical incidents, due to possible problems of stress on the operators' speech.  It is proposed that using ASR will require operators to adapt a commonly used skill to cater for a novel use of speech. Before using the ASR device, new operators will require some form of training. It is shown that a demonstration by an experienced user of the device can lead to superior performance than instructions. Thus, a relatively cheap and very efficient form of operator training can be supplied by demonstration by experienced ASR operators. From a series of studies into speech based interaction with computers, it is concluded that the interaction be designed to capitalise upon the tendency of operators to use short, succinct, task specific styles of speech. From studies comparing different types of feedback, it is concluded that operators be given screen based feedback, rather than auditory feedback, for control room operation. Feedback will take two forms: the use of the ASR device will require recognition feedback, which will be best supplied using text; the performance of a process control task will require task feedback integrated into the mimic display. This latter feedback can be either textual or symbolic, but it is suggested that symbolic feedback will be more beneficial. Related to both interaction style and feedback is the issue of handling recognition errors. These should be corrected by simple command repetition practices, rather than use error handling dialogues. This method of error correction is held to be non intrusive to primary command and control operations. This thesis also addresses some of the problems of user error in ASR use, and provides a number of recommendations for its reduction.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Spectral peak resolution was investigated in normal hearing (NH), hearing impaired (HI), and cochlear implant (CI) listeners. The task involved discriminating between two rippled noise stimuli in which the frequency positions of the log-spaced peaks and valleys were interchanged. The ripple spacing was varied adaptively from 0.13 to 11.31 ripples/octave, and the minimum ripple spacing at which a reversal in peak and trough positions could be detected was determined as the spectral peak resolution threshold for each listener. Spectral peak resolution was best, on average, in NH listeners, poorest in CI listeners, and intermediate for HI listeners. There was a significant relationship between spectral peak resolution and both vowel and consonant recognition in quiet across the three listener groups. The results indicate that the degree of spectral peak resolution required for accurate vowel and consonant recognition in quiet backgrounds is around 4 ripples/octave, and that spectral peak resolution poorer than around 1–2 ripples/octave may result in highly degraded speech recognition. These results suggest that efforts to improve spectral peak resolution for HI and CI users may lead to improved speech recognition

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of the present study was to examine the benefits of providing audible speech to listeners with sensorineural hearing loss when the speech is presented in a background noise. Previous studies have shown that when listeners have a severe hearing loss in the higher frequencies, providing audible speech (in a quiet background) to these higher frequencies usually results in no improvement in speech recognition. In the present experiments, speech was presented in a background of multitalker babble to listeners with various severities of hearing loss. The signal was low-pass filtered at numerous cutoff frequencies and speech recognition was measured as additional high-frequency speech information was provided to the hearing-impaired listeners. It was found in all cases, regardless of hearing loss or frequency range, that providing audible speech resulted in an increase in recognition score. The change in recognition as the cutoff frequency was increased, along with the amount of audible speech information in each condition (articulation index), was used to calculate the "efficiency" of providing audible speech. Efficiencies were positive for all degrees of hearing loss. However, the gains in recognition were small, and the maximum score obtained by an listener was low, due to the noise background. An analysis of error patterns showed that due to the limited speech audibility in a noise background, even severely impaired listeners used additional speech audibility in the high frequencies to improve their perception of the "easier" features of speech including voicing

Relevância:

100.00% 100.00%

Publicador:

Resumo:

I investigated the genetic relationship between male and female components of the mate recognition system and how this relationship influenced the subsequent evolution of the two traits, in a series of replicate populations of interspecific hybrids. Thirty populations of hybrids between Drosophila serrata and Drosophila birchii were established and maintained for 24 generations. At the fifth generation after hybridization, the mating success of hybrid individuals with the D. serrata parent was determined. The genetic correlation between male and female components of the male recognition system, as a consequence of pleiotropy or tight physical linkage, was found to be significant but low (r = 0.388). This result suggested that pleiotropy may play only a minor role in the evolution of mate recognition in this system. At the twenty-fourth generation after hybridization, the mating success of the hybrids was again determined. The evolution of male and female components was investigated by analyzing the direction of evolution of each hybrid line with respect to its initial position in relation to the genetic regression. Male and female components appeared to converge on a single equilibrium point, rather than evolving along trajectories with slope equal to the genetic regression, toward a line of equilibria.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sendo uma forma natural de interação homem-máquina, o reconhecimento de gestos implica uma forte componente de investigação em áreas como a visão por computador e a aprendizagem computacional. O reconhecimento gestual é uma área com aplicações muito diversas, fornecendo aos utilizadores uma forma mais natural e mais simples de comunicar com sistemas baseados em computador, sem a necessidade de utilização de dispositivos extras. Assim, o objectivo principal da investigação na área de reconhecimento de gestos aplicada à interacção homemmáquina é o da criação de sistemas, que possam identificar gestos específicos e usálos para transmitir informações ou para controlar dispositivos. Para isso as interfaces baseados em visão para o reconhecimento de gestos, necessitam de detectar a mão de forma rápida e robusta e de serem capazes de efetuar o reconhecimento de gestos em tempo real. Hoje em dia, os sistemas de reconhecimento de gestos baseados em visão são capazes de trabalhar com soluções específicas, construídos para resolver um determinado problema e configurados para trabalhar de uma forma particular. Este projeto de investigação estudou e implementou soluções, suficientemente genéricas, com o recurso a algoritmos de aprendizagem computacional, permitindo a sua aplicação num conjunto alargado de sistemas de interface homem-máquina, para reconhecimento de gestos em tempo real. A solução proposta, Gesture Learning Module Architecture (GeLMA), permite de forma simples definir um conjunto de comandos que pode ser baseado em gestos estáticos e dinâmicos e que pode ser facilmente integrado e configurado para ser utilizado numa série de aplicações. É um sistema de baixo custo e fácil de treinar e usar, e uma vez que é construído unicamente com bibliotecas de código. As experiências realizadas permitiram mostrar que o sistema atingiu uma precisão de 99,2% em termos de reconhecimento de gestos estáticos e uma precisão média de 93,7% em termos de reconhecimento de gestos dinâmicos. Para validar a solução proposta, foram implementados dois sistemas completos. O primeiro é um sistema em tempo real capaz de ajudar um árbitro a arbitrar um jogo de futebol robótico. A solução proposta combina um sistema de reconhecimento de gestos baseada em visão com a definição de uma linguagem formal, o CommLang Referee, à qual demos a designação de Referee Command Language Interface System (ReCLIS). O sistema identifica os comandos baseados num conjunto de gestos estáticos e dinâmicos executados pelo árbitro, sendo este posteriormente enviado para um interface de computador que transmite a respectiva informação para os robôs. O segundo é um sistema em tempo real capaz de interpretar um subconjunto da Linguagem Gestual Portuguesa. As experiências demonstraram que o sistema foi capaz de reconhecer as vogais em tempo real de forma fiável. Embora a solução implementada apenas tenha sido treinada para reconhecer as cinco vogais, o sistema é facilmente extensível para reconhecer o resto do alfabeto. As experiências também permitiram mostrar que a base dos sistemas de interação baseados em visão pode ser a mesma para todas as aplicações e, deste modo facilitar a sua implementação. A solução proposta tem ainda a vantagem de ser suficientemente genérica e uma base sólida para o desenvolvimento de sistemas baseados em reconhecimento gestual que podem ser facilmente integrados com qualquer aplicação de interface homem-máquina. A linguagem formal de definição da interface pode ser redefinida e o sistema pode ser facilmente configurado e treinado com um conjunto de gestos diferentes de forma a serem integrados na solução final.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biometric recognition has recently emerged as part of applications where the privacy of the information is crucial, as in the health care field. This paper presents a biometric recognition system based on the Electrocardiographic signal (ECG). The proposed system is based on a state-of-the-art recognition method which extracts information from the frequency domain. In this paper we propose a new method to increase the spectral resolution of low bandwidth ECG signals due to the limited bandwidth of the acquisition sensor. Preliminary results show that the proposed scheme reveals a higher identification rate and lower equal error rate when compared to previous approaches.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

"Lecture notes in computational vision and biomechanics series, ISSN 2212-9391, vol. 19"

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Vision-based hand gesture recognition is an area of active current research in computer vision and machine learning. Being a natural way of human interaction, it is an area where many researchers are working on, with the goal of making human computer interaction (HCI) easier and natural, without the need for any extra devices. So, the primary goal of gesture recognition research is to create systems, which can identify specific human gestures and use them, for example, to convey information. For that, vision-based hand gesture interfaces require fast and extremely robust hand detection, and gesture recognition in real time. Hand gestures are a powerful human communication modality with lots of potential applications and in this context we have sign language recognition, the communication method of deaf people. Sign lan- guages are not standard and universal and the grammars differ from country to coun- try. In this paper, a real-time system able to interpret the Portuguese Sign Language is presented and described. Experiments showed that the system was able to reliably recognize the vowels in real-time, with an accuracy of 99.4% with one dataset of fea- tures and an accuracy of 99.6% with a second dataset of features. Although the im- plemented solution was only trained to recognize the vowels, it is easily extended to recognize the rest of the alphabet, being a solid foundation for the development of any vision-based sign language recognition user interface system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tese de Doutoramento em Engenharia de Eletrónica e de Computadores

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This project was funded under the Applied Research Grants Scheme administered by Enterprise Ireland. The project was a partnership between Galway - Mayo Institute of Technology and an industrial company, Tyco/Mallinckrodt Galway. The project aimed to develop a semi - automatic, self - learning pattern recognition system capable of detecting defects on the printed circuits boards such as component vacancy, component misalignment, component orientation, component error, and component weld. The research was conducted in three directions: image acquisition, image filtering/recognition and software development. Image acquisition studied the process of forming and digitizing images and some fundamental aspects regarding the human visual perception. The importance of choosing the right camera and illumination system for a certain type of problem has been highlighted. Probably the most important step towards image recognition is image filtering, The filters are used to correct and enhance images in order to prepare them for recognition. Convolution, histogram equalisation, filters based on Boolean mathematics, noise reduction, edge detection, geometrical filters, cross-correlation filters and image compression are some examples of the filters that have been studied and successfully implemented in the software application. The software application developed during the research is customized in order to meet the requirements of the industrial partner. The application is able to analyze pictures, perform the filtering, build libraries, process images and generate log files. It incorporates most of the filters studied and together with the illumination system and the camera it provides a fully integrated framework able to analyze defects on printed circuit boards.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: Single Nucleotide Polymorphisms, among other type of sequence variants, constitute key elements in genetic epidemiology and pharmacogenomics. While sequence data about genetic variation is found at databases such as dbSNP, clues about the functional and phenotypic consequences of the variations are generally found in biomedical literature. The identification of the relevant documents and the extraction of the information from them are hampered by the large size of literature databases and the lack of widely accepted standard notation for biomedical entities. Thus, automatic systems for the identification of citations of allelic variants of genes in biomedical texts are required. Results: Our group has previously reported the development of OSIRIS, a system aimed at the retrieval of literature about allelic variants of genes http://ibi.imim.es/osirisform.html. Here we describe the development of a new version of OSIRIS (OSIRISv1.2, http://ibi.imim.es/OSIRISv1.2.html webcite) which incorporates a new entity recognition module and is built on top of a local mirror of the MEDLINE collection and HgenetInfoDB: a database that collects data on human gene sequence variations. The new entity recognition module is based on a pattern-based search algorithm for the identification of variation terms in the texts and their mapping to dbSNP identifiers. The performance of OSIRISv1.2 was evaluated on a manually annotated corpus, resulting in 99% precision, 82% recall, and an F-score of 0.89. As an example, the application of the system for collecting literature citations for the allelic variants of genes related to the diseases intracranial aneurysm and breast cancer is presented. Conclusion: OSIRISv1.2 can be used to link literature references to dbSNP database entries with high accuracy, and therefore is suitable for collecting current knowledge on gene sequence variations and supporting the functional annotation of variation databases. The application of OSIRISv1.2 in combination with controlled vocabularies like MeSH provides a way to identify associations of biomedical interest, such as those that relate SNPs with diseases.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this work we explore the multivariate empirical mode decomposition combined with a Neural Network classifier as technique for face recognition tasks. Images are simultaneously decomposed by means of EMD and then the distance between the modes of the image and the modes of the representative image of each class is calculated using three different distance measures. Then, a neural network is trained using 10- fold cross validation in order to derive a classifier. Preliminary results (over 98 % of classification rate) are satisfactory and will justify a deep investigation on how to apply mEMD for face recognition.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The value of earmarks as an efficient means of personal identification is still subject to debate. It has been argued that the field is lacking a firm systematic and structured data basis to help practitioners to form their conclusions. Typically, there is a paucity of research guiding as to the selectivity of the features used in the comparison process between an earmark and reference earprints taken from an individual. This study proposes a system for the automatic comparison of earprints and earmarks, operating without any manual extraction of key-points or manual annotations. For each donor, a model is created using multiple reference prints, hence capturing the donor within source variability. For each comparison between a mark and a model, images are automatically aligned and a proximity score, based on a normalized 2D correlation coefficient, is calculated. Appropriate use of this score allows deriving a likelihood ratio that can be explored under known state of affairs (both in cases where it is known that the mark has been left by the donor that gave the model and conversely in cases when it is established that the mark originates from a different source). To assess the system performance, a first dataset containing 1229 donors elaborated during the FearID research project was used. Based on these data, for mark-to-print comparisons, the system performed with an equal error rate (EER) of 2.3% and about 88% of marks are found in the first 3 positions of a hitlist. When performing print-to-print transactions, results show an equal error rate of 0.5%. The system was then tested using real-case data obtained from police forces.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As part of the Affective Computing research field, the development of automatic affective recognition systems can enhance human-computer interactions by allowing the creation of interfaces that react to the user's emotional state. To that end, this Master Thesis brings affect recognition to nowadays most used human computer interface, mobile devices, by developing a facial expression recognition system able to perform detection under the difficult conditions of viewing angle and illumination that entails the interaction with a mobile device. Moreover, this Master Thesis proposes to combine emotional features detected from expression with contextual information of the current situation, to infer a complex and extensive emotional state of the user. Thus, a cognitive computational model of emotion is defined that provides a multicomponential affective state of the user through the integration of the detected emotional features into appraisal processes. In order to account for individual differences in the emotional experience, these processes can be adapted to the culture and personality of the user.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the modern warfare there is an active development of a new trend connected with a robotic warfare. One of the critical elements of robotics warfare systems is an automatic target recognition system, allowing to recognize objects, based on the data received from sensors. This work considers aspects of optical realization of such a system by means of NIR target scanning at fixed wavelengths. An algorithm was designed, an experimental setup was built and samples of various modern gear and apparel materials were tested. For pattern testing the samples of actively arm engaged armies camouflages were chosen. Tests were performed both in clear atmosphere and in the artificial extremely humid and hot atmosphere to simulate field conditions.