76 results for audio-visual automatic speech recognition


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we shed light on the problem of automatic landslide recognition using supervised classification, and we introduce the Optimum-Path Forest (OPF) classifier in this context. We employed two images acquired by the Geoeye-MS satellite in March 2010, covering the northwest (steep areas) and north (pipeline area) sides of the city of Duque de Caxias, Rio de Janeiro State, Brazil. The landslide recognition rate was assessed through cross-validation with 10 runs. As for the classifiers, we compared OPF against an SVM with a Radial Basis Function kernel and a Bayesian classifier. We conclude that OPF, Bayes and SVM all achieved high recognition rates, with OPF being the fastest approach. © 2012 IEEE.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this letter, a speech recognition algorithm based on the least-squares method is presented. In particular, the intention is to exemplify how such a traditional numerical technique can be applied to solve a signal processing problem that is usually treated with more elaborate formulations.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In many science fiction movies, machines are capable of speaking with humans. However, mankind is still far from building such machines, like the famous character C3PO of Star Wars. Over the last six decades, automatic speech recognition systems have been the target of many studies, and many techniques have been developed for use in both software and hardware applications. Among the many types of automatic speech recognition systems, the one used in this work is a speaker-independent, isolated-word system that uses Hidden Markov Models for recognition. The goal of this work is to design and synthesize the first two stages of the speech recognition system: speech signal acquisition and signal pre-processing. Both stages were implemented on a reprogrammable device, an FPGA, using the VHDL hardware description language, owing to the high performance of the device and the flexibility of the language. This work presents the underlying theory of digital signal processing, such as Fast Fourier Transforms and digital filters, as well as the theory of speech recognition using Hidden Markov Models and an LPC processor. It also presents the results obtained for each of the blocks synthesized and verified in hardware.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The applications of Automatic Vowel Recognition (AVR), a sub-task of fundamental importance in most speech processing systems, range from automatic interpretation of spoken language to biometrics. State-of-the-art AVR systems are based on traditional machine learning models such as Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs). However, such classifiers cannot deliver efficiency and effectiveness at the same time, which leaves a gap to be explored when real-time processing is required. In this work, we present an algorithm for AVR based on the Optimum-Path Forest (OPF), an emerging pattern recognition technique recently introduced in the literature. Adopting a supervised training procedure and using speech tags from two public datasets, we observed that OPF outperformed ANNs, SVMs, and other classifiers in terms of training time and accuracy. ©2010 IEEE.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article is part of a study aimed at identifying the main barriers to the inclusion of visually impaired students in Physics classes. It focuses on understanding the communication contexts that facilitate or hinder the effective participation of students with visual impairment in Mechanics activities. To do so, the research defines, from the empirical and semantic-sensory structures, the languages used in the activities, as well as the moment and the speech pattern in which those languages were used. As a result, it identifies a relation between the use of the interdependent audio-visual empirical language structure and the non-interactive episodes of authority; a decrease in the use of this structure in interactive episodes; the creation of educational segregation environments within the classroom; and the frequent use of the interdependent tactile-hearing empirical language structure in such environments.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article continues the research results presented in Camargo and Nardi (2007). It is part of a study that seeks to understand the main barriers to the inclusion of students with visual impairment in Physics classes. It aims to understand which communication contexts favor or hinder the real participation of visually impaired students in thermology activities. To this end, the research defines, from the empirical and semantic-sensory structures, the languages used in the activities, as well as the moment and the speech pattern in which those languages were used. As a result, it identifies a strong relation between the use of the interdependent audio-visual empirical language structure and the non-interactive episodes of authority; a decrease in the use of this structure in interactive episodes; and the creation of educational segregation environments within the classroom.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article is part of a wider study that seeks to understand the main barriers to inclusion in Physics classes for students with visual impairment. It aims to understand which communication contexts favor or impede the real participation of visually impaired students in Modern Physics activities. The research defines, from the empirical and semantic-sensory structures, the languages used in the activities, as well as the moment and the speech pattern in which those languages were used. As a result, this study identifies a strong relation between the use of the interdependent audio-visual empirical language structure and the non-interactive episodes of authority; a decrease in the use of this structure in interactive episodes; the creation of educational segregation environments within the classroom; and the frequent use of the interdependent tactile-hearing empirical language structure in these environments. Moreover, the concept of «special educational need» is discussed, its inadequate use is analyzed, and suggestions are given for its correct use.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article continues the results presented in Camargo and Nardi (Revista Brasileira de Ensino de Física 29, 117 (2007)). It is part of a study that seeks to understand the main barriers to the inclusion of students with visual impairment in physics teaching. Focusing on optics classes, it analyzes the communication difficulties between pre-service teachers and students with visual impairment. To this end, it emphasizes the empirical and semantic-sensory structures of the languages used, indicating the factors that create accessibility difficulties in the information conveyed. It also recommends alternatives intended to enable the effective participation of students with visual impairment in the communicative process, notably: identifying the semantic-sensory structure of the meanings conveyed, knowing the student's visual history, dismantling the interdependent audio-visual empirical structure, and exploring the communicative potential of languages built on empirical structures whose access does not depend on vision. It concludes that communication is the main barrier to the effective participation of students with visual impairment in optics classes and stresses the importance of creating adequate communication channels as a basic condition for the inclusion of these students.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

BACKGROUND: computerized auditory-visual remediation program for students with developmental dyslexia. AIMS: to verify the efficacy of a computerized auditory-visual remediation program in students with developmental dyslexia. Among its specific aims, the study set out to compare the cognitive-linguistic performance of students with developmental dyslexia with that of students who are good readers; to compare the pre-test and post-test assessment findings for students with dyslexia who did and did not undergo the program; and, finally, to compare the findings of the remediation program for students with dyslexia and good readers who underwent it. METHOD: 20 students participated in this study. Group I (GI) was subdivided into GIe, composed of five students with developmental dyslexia who underwent the program, and GIc, composed of five students with developmental dyslexia who did not. Group II (GII) was subdivided into GIIe, composed of five good readers who underwent the remediation, and GIIc, composed of five good readers who did not. The computerized auditory-visual remediation program Play-on was used. RESULTS: GI performed worse than GII in auditory processing and phonological awareness skills at pre-testing. However, GIe performed similarly to GII at post-testing, indicating the efficacy of auditory-visual remediation in students with developmental dyslexia. CONCLUSION: the study provided evidence of the efficacy of the auditory-visual remediation program in students with developmental dyslexia.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

PURPOSE: to compare the performance of hearing aid users and non-users on the SSW test. METHOD: the study was carried out with 13 subjects aged 55 to 85 years with bilateral hearing loss, six of whom were bilateral hearing aid users and seven of whom did not use hearing aids. The auditory processing test applied was the SSW test of disyllable recognition in a dichotic task. Statistical analysis was performed using the Bootstrap technique and the Kolmogorov-Smirnov hypothesis test. RESULTS: the group of users performed better than the group of non-users in the conditions studied, especially in the competing conditions. CONCLUSION: the results of this research point to the efficacy of hearing aid use in improving speech comprehension in the population studied, not only by compensating for peripheral hearing loss but also by influencing the aging process of the central auditory nervous system.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Anaerobic threshold (AT) is usually estimated as a change-point problem by visual analysis of the cardiorespiratory response to incremental dynamic exercise. In this study, two-phase linear (TPL) models of the linear-linear and linear-quadratic type were used to estimate AT. The correlation coefficient between the classical and statistical approaches was 0.88, and 0.89 after outlier exclusion. The TPL models provide a simple method for estimating AT that can be easily implemented on a digital computer for automatic pattern recognition of AT.
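The linear-linear variant of a two-phase model can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes an exhaustive search over split points with an independent least-squares line on each side, and takes the intersection of the two lines as the change point. The function names and the synthetic data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def two_phase_change_point(xs, ys, min_points=3):
    """Assumed approach: try every split, keep the one with the smallest
    total squared error, and return where the two fitted lines cross."""
    best = None
    for k in range(min_points, len(xs) - min_points + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, k, left, right)
    _, k, (a1, b1, _), (a2, b2, _) = best
    if a1 == a2:
        # Parallel segments never cross; fall back to the split sample.
        return float(xs[k])
    return (b2 - b1) / (a1 - a2)


# Synthetic response (hypothetical data): slope 1 up to x = 9.5, slope 3 after.
xs = list(range(20))
ys = [x if x < 9.5 else 3 * x - 19 for x in xs]
change_point = two_phase_change_point(xs, ys)
```

A linear-quadratic variant would replace the right-hand fit with a quadratic; the exhaustive search over split points stays the same.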

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents some results of the application of Evolvable Hardware (EHW) in the area of voice recognition. Evolvable Hardware is able to change its inner connections using genetic learning techniques, adapting its own functionality to changes in external conditions. This technique became feasible with the improvement of Programmable Logic Devices. Nowadays, a single device can change part of its own circuit on-line and in real time. This work proposes a reconfigurable architecture for a system that receives voice commands to execute special tasks, such as helping handicapped persons in their daily home routines. The idea is to collect several voice samples and process them with algorithms based on Mel-cepstral theory to obtain the numerical coefficients of each sample, which compose the search space used by the genetic algorithm. The voice patterns considered are limited to seven sustained Portuguese vowel phonemes (a, eh, e, i, oh, o, u).

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This letter describes a novel algorithm based on autoregressive decomposition and pole tracking, used to recognize two patterns of speech data: normal voice and dysphonic voice caused by nodules. The presented method relates the poles to the peaks of the signal spectrum, which represent the periodic components of the voice. The results show that the perturbation contained in the signal is clearly depicted by the pole positions, whose variability is related to jitter and shimmer. The pole dispersion for pathological voices is about 20% higher than for normal voices; the proposed approach is therefore a more trustworthy measure than the classical ones. © 2007.
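The link between autoregressive poles and spectral peaks can be illustrated with a minimal Python sketch. It is not the algorithm from the letter: it assumes a second-order AR model fitted by least squares to a single synthetic sinusoid, whereas a voice analysis would use a higher order and track the poles frame by frame. All names and parameter values are hypothetical.

```python
import cmath
import math


def ar2_poles(x):
    """Least-squares fit of x[n] = a1*x[n-1] + a2*x[n-2]; returns both poles."""
    idx = range(2, len(x))
    r11 = sum(x[n - 1] * x[n - 1] for n in idx)
    r12 = sum(x[n - 1] * x[n - 2] for n in idx)
    r22 = sum(x[n - 2] * x[n - 2] for n in idx)
    b1 = sum(x[n] * x[n - 1] for n in idx)
    b2 = sum(x[n] * x[n - 2] for n in idx)
    # Solve the 2x2 normal equations with Cramer's rule.
    det = r11 * r22 - r12 * r12
    a1 = (b1 * r22 - b2 * r12) / det
    a2 = (r11 * b2 - r12 * b1) / det
    # The poles are the roots of z^2 - a1*z - a2 = 0.
    disc = cmath.sqrt(a1 * a1 + 4 * a2)
    return (a1 + disc) / 2, (a1 - disc) / 2


def pole_frequency(pole, sample_rate):
    """Frequency in Hz of the spectral peak associated with a pole angle."""
    return abs(cmath.phase(pole)) * sample_rate / (2 * math.pi)


# Synthetic periodic component (assumed values): 200 Hz sampled at 8 kHz.
sample_rate = 8000
signal = [math.cos(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
pole, _ = ar2_poles(signal)
peak_hz = pole_frequency(pole, sample_rate)
```

For a clean sinusoid the pole sits on the unit circle at the angle of the tone. Cycle-to-cycle perturbations in frequency or amplitude would move the estimated pole from frame to frame, which is the kind of variability the abstract relates to jitter and shimmer.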

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Biometrics is one of the strongest trends in human identification, and the fingerprint is the most widely used biometric. However, it is a common mistake to consider automatic fingerprint recognition a completely solved problem. The most popular and extensively used methods, which are minutiae-based, do not perform well on poor-quality images or when only a small area of overlap exists between the template and the query images. The use of multibiometrics is considered one of the keys to overcoming these weaknesses and improving the accuracy of biometric systems. This paper presents the fusion of a minutiae-based and a ridge-based fingerprint recognition method at the rank, decision and score levels. The fusion techniques implemented led to a 31.78% reduction in the Equal Error Rate (from 4.09% to 2.79%) and to an improvement of 6 positions in the rank needed to reach a Correct Retrieval (from rank 8 to rank 2) when assessed on the FVC2002-DB1A database. © 2008 IEEE.
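The reported reduction is relative: (4.09 - 2.79) / 4.09 is about 31.78%. The Python sketch below checks that arithmetic and illustrates one common form of score-level fusion. The abstract does not state which fusion rule was used, so the min-max normalization and weighted sum shown here are assumptions, and the function names and example scores are hypothetical.

```python
def relative_reduction(before, after):
    """Relative reduction, in percent, from `before` to `after`."""
    return 100.0 * (before - after) / before


def min_max(scores):
    """Rescale matcher scores to [0, 1] so two matchers become comparable."""
    low, high = min(scores), max(scores)
    return [(s - low) / (high - low) for s in scores]


def weighted_sum_fusion(scores_a, scores_b, weight=0.5):
    """Assumed fusion rule: weighted sum of two normalized score lists."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(scores_a, scores_b)]


# Equal Error Rate figures quoted in the abstract.
eer_reduction = round(relative_reduction(4.09, 2.79), 2)

# Hypothetical raw scores from a minutiae-based and a ridge-based matcher.
minutiae_scores = min_max([2, 4, 6])
ridge_scores = min_max([10, 30, 20])
fused_scores = weighted_sum_fusion(minutiae_scores, ridge_scores)
```

Normalizing before summing keeps a matcher with a wider score range from dominating the fused score.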

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)