Biblioteca Digital

904 results for Audio-Visual Automatic Speech Recognition

Development & evaluation of different acoustic models for Malayalam continuous speech recognition

Relevance:

100.00%

Abstract:

The performance of any continuous speech recognition system depends on the accuracy of its acoustic model. Hence, preparing a robust and accurate acoustic model leads to satisfactory recognition performance for a speech recognizer. In acoustic modeling of phonetic units, context information is of prime importance, as phonemes are found to vary according to their place of occurrence in a word. In this paper we compare and evaluate the effect of context-dependent tied (CD tied) models, context-dependent (CD) models and context-independent (CI) models for continuous speech recognition of the Malayalam language. The database for the speech recognition system contains utterances from 21 speakers (11 female and 10 male). Our evaluation results show that CD tied models outperform CI models by over 21%.

A Comparative study of HMM and SVM in Malayalam Digit Recognition

Relevance:

100.00%

Abstract:

Speech is a primary medium through which human beings communicate in language. Automatic speech recognition is widespread today. Recognizing single digits is vital to a number of applications, such as voice dialling of telephone numbers, automatic data entry, credit card entry, PIN (personal identification number) entry and entry of access codes for transactions. In this paper we present a comparative study of SVM (Support Vector Machine) and HMM (Hidden Markov Model) approaches to recognizing and identifying the digits used in Malayalam speech.

A Hybrid Architecture for Recognising Speech Signals in Malayalam

Relevance:

100.00%

Abstract:

Speech is the primary, most prominent and most convenient means of communication in audible language. Through speech, people can express their thoughts, feelings or perceptions by the articulation of words. Human speech is a complex signal that is non-stationary in nature. It carries immensely rich information about the words spoken and about the speaker's accent, attitude, expression, intention, sex, emotion and style. The main objective of Automatic Speech Recognition (ASR) is to identify what people say by means of computer algorithms. This enables people to communicate with a computer in natural spoken language. Automatic recognition of speech by machines has been one of the most exciting, significant and challenging areas of research in signal processing over the past five to six decades. Despite the developments and intensive research in this area, the performance of ASR is still lower than that of speech recognition by humans and has yet to reach a completely reliable level. The main objective of this thesis is to develop an efficient speech recognition system for recognising speaker-independent isolated words in Malayalam.

Departament de Psicologia [Review of the book Adolescents and audio-visual media in five countries, by Ferran Casas, Irene Rizzini, Rose September, Per Egil Mjaavatn and Usha Nayar (coord.)]

Relevance:

100.00%

Abstract:

This book is the product of years of cooperation between research teams from five different countries, all of them Key Institutions of the Childwatch International network, within the framework of a multinational project on adolescents and media

Production of the historical-ethnographic audio-visual documentary: Castilian-Leonese members of religious orders in the evangelization of the Americas.

Relevance:

100.00%

Abstract:

Not published.

Iris: audio-visual project: database and images on people with disabilities

Relevance:

100.00%

Abstract:

The video was made by lecturers from the areas of Didactics and School Organization and of Theory and History of Education during 2000 and 2001. It gathers the opinions of professionals, fathers and mothers, people with physical disabilities (deaf, blind and P.C.I.) and people with mental disabilities on different aspects of daily life: the home, entry into employment, etc. This teaching resource is designed for viewing, theoretical and practical interpretation, and comparison of opinions in the higher-education classroom, in order to support the training of professionals who will work with people with disabilities.

The structuro-global audio-visual methodology and the distance teaching of French.

Relevance:

100.00%

Abstract:

The function of language study in the Bachillerato is threefold. First, it is a factor in socio-economic advancement, in some cases making salary improvements possible and in others giving access to posts that are closed to those who do not know languages. Second, UNESCO recommends language study for its educational function with respect to the human being as a member of different national groups: it enriches critical sense and tolerance through an appreciation of the differences and similarities among peoples, a humanist culture that the study of French should foster. This matters all the more for us given that France is a neighbouring country and opens the way to Europe; it is natural that French should be so important to us because of the commercial, economic and other relations conducted in that language. Third, and most fundamentally, learning at least one language is essential to the formation of personality. Since 1975 there have been important advances in language study, above all efforts at didactic renewal, notably the contributions of the structuro-global audio-visual methodology, which emerged from the 1950s onward and is being constantly renewed. A student who is to learn French at a distance must have suitable material in the form of cassettes with dialogues in order to learn to pronounce correctly. Reading and writing are learned afterwards, on the assumption that the student can already pronounce correctly, and transcribing the spoken language is an exercise for consolidating knowledge. Learning a language, however, requires setting aside a specific amount of time every day; this regularity is what makes learning possible. Thus, in each case the student should act in accordance with the more precise and personal guidance of his or her teacher-tutor and with his or her own working habits, provided that these prove effective.

Utilizing hearing assistive technology (HAT) to assess speech recognition: Comparison of word recognition scores obtained by hearing instrument users

Relevance:

100.00%

Abstract:

The ability of individuals with hearing loss to accurately distinguish correct from incorrect verbal responses during traditional word recognition testing was assessed across four different listening conditions.

Economic implications of new communication technologies on the audio-visual markets

Relevance:

100.00%

The private copying of sound and audio-visual recordings

Relevance:

100.00%

Televised Whorf: cognitive restructuring in advanced foreign language learners as a function of audio-visual media exposure

Relevance:

100.00%

Abstract:

The encoding of goal-oriented motion events varies across different languages. Speakers of languages without grammatical aspect (e.g., Swedish) tend to mention motion endpoints when describing events, e.g., “two nuns walk to a house”, and attach importance to event endpoints when matching scenes from memory. Speakers of aspect languages (e.g., English), on the other hand, are more prone to direct attention to the ongoingness of motion events, which is reflected both in their event descriptions, e.g., “two nuns are walking”, and in their non-verbal similarity judgements. This study examines to what extent native speakers of Swedish (n = 82) with English as a foreign language (FL) restructure their categorisation of goal-oriented motion as a function of their English proficiency and experience with the English language (e.g., exposure, learning). Seventeen monolingual native English speakers from the United Kingdom (UK) were recruited for comparison purposes. Data on motion event cognition were collected through a memory-based triads matching task, in which a target scene with an intermediate degree of endpoint orientation was matched with two alternative scenes with low and high degrees of endpoint orientation, respectively. Results showed that the preference among the Swedish speakers of L2 English to base their similarity judgements on ongoingness rather than event endpoints was correlated with their use of English in their everyday lives, such that those who often watched television in English approximated the ongoingness preference of the English native speakers. These findings suggest that event cognition patterns may be restructured through exposure to FL audio-visual media. The results thus add to the emerging picture that learning a new language entails learning new ways of observing and reasoning about reality.

N1 enhancement in synesthesia during visual and audio-visual perception in semantic cross-modal conflict situations: an ERP study

Relevance:

100.00%

Abstract:

Synesthesia entails a special kind of sensory perception, in which stimulation in one sensory modality leads to an internally generated perceptual experience in another, unstimulated sensory modality. This phenomenon can be viewed as an abnormal multisensory integration process, as the synesthetic percept is aberrantly fused with the stimulated modality. Indeed, recent synesthesia research has focused on multimodal processing even outside the specific synesthesia-inducing context and has revealed changed multimodal integration, suggesting perceptual alterations at a global level. Here, we focused on audio-visual processing in synesthesia using a semantic classification task with visually or auditory-visually presented animate and inanimate objects, presented in an audio-visually congruent or incongruent manner. Fourteen subjects with auditory-visual and/or grapheme-color synesthesia and 14 control subjects participated in the experiment. During presentation of the stimuli, event-related potentials were recorded from 32 electrodes. The analysis of reaction times and error rates revealed no group differences, with the best performance for audio-visually congruent stimulation, indicating the well-known multimodal facilitation effect. We found an enhanced amplitude of the N1 component over occipital electrode sites for synesthetes compared to controls. The differences occurred irrespective of the experimental condition and therefore suggest a global influence on early sensory processing in synesthetes.

The relation between speech recognition in noise and the speech-evoked brainstem response in normal-hearing and hearing-impaired individuals

Relevance:

100.00%

Abstract:

Little is known about the way speech in noise is processed along the auditory pathway. The purpose of this study was to evaluate the relation between listening in noise, measured using the R-Space system, and the speech-evoked auditory brainstem response recorded in quiet and in noise in adult participants with normal hearing and with mild to moderate hearing loss.

Speech recognition in reverberation in bimodal cochlear implant users

Relevance:

100.00%

Abstract:

The purpose of the present study was to evaluate the effects of bimodal (implant plus hearing aid) listening on speech recognition in four different environmental conditions. Results indicate that there was little difference between the cochlear-implant-only and bimodal conditions.

Automatic landslide recognition through Optimum-Path Forest

Relevance:

100.00%

Abstract:

In this paper we shed light on the problem of automatic landslide recognition using supervised classification, and we introduce the Optimum-Path Forest (OPF) classifier in this context. We employed two images acquired by the Geoeye-MS satellite in March 2010 over the northwest (high steep areas) and north (pipeline area) sides of Duque de Caxias city, Rio de Janeiro State, Brazil. The landslide recognition rate was assessed through cross-validation with 10 runs. As for the classifiers, we compared OPF against an SVM with a Radial Basis Function kernel and a Bayesian classifier. We conclude that OPF, Bayes and SVM all achieved high recognition rates, with OPF being the fastest approach. © 2012 IEEE.
