908 results for Automatic speech recognition (ASR)


Relevance:

100.00%

Publisher:

Abstract:

RÉSUMÉ (translated from French). Communication disorders are usually accommodated in information retrieval systems, such as those found on the Web, through interfaces whose modalities involve neither reading nor writing. Few applications help users who struggle within the textual modality. We propose taking phonological awareness into account to assist users who have difficulty writing queries (dysorthographia) or reading documents (dyslexia). First, we propose a system that rewrites and interprets queries typed by the user: given the causes of dysorthographia and the examples available to us, a system combining an editorial approach (in the manner of a spelling corrector) with an oral approach (an automatic transcription system) proved more appropriate. Second, a machine learning method uses specific criteria, such as grapho-phonemic cohesion, to estimate the readability of a sentence and then of a text.

ABSTRACT. Most applications that aim to help disabled users in the information retrieval process do so by proposing non-textual modalities. This paper introduces specific parameters linked to phonological awareness in the textual modality. These parameters improve the ability of systems to deal with orthographic issues and to adapt results to the reader, for example when the reader is dyslexic. We propose a phonology-based, sentence-level rewriting system that combines spelling correction, speech synthesis and automatic speech recognition. It has been evaluated on a corpus of questions collected from dyslexic children. We also propose a specific sentence readability measure that involves phonetic parameters such as grapho-phonemic cohesion. The measure has been learned on a corpus of reading times of sentences read by dyslexic children.

Relevance:

100.00%

Publisher:

Abstract:

Spoken term detection (STD) is the task of looking up a spoken term in a large volume of speech segments. To provide fast search, speech segments are first indexed into an intermediate representation using speech recognition engines, which provide multiple hypotheses for each speech segment. Approximate matching techniques are usually applied at the search stage to compensate for the poor performance of automatic speech recognition engines during indexing. Recently, using visual information in addition to audio information has been shown to improve phone recognition performance, particularly in noisy environments. In this paper, we make use of visual information, in the form of the speaker's lip movements, in the indexing stage and investigate its effect on STD performance. In particular, we investigate whether gains in phone recognition accuracy carry through the approximate matching stage to provide similar gains in the final audio-visual STD system over a traditional audio-only approach. We also investigate the effect of using visual information on STD performance in different noise environments.
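
The approximate matching step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes each indexed segment is reduced to a single best phone sequence (the abstract describes multiple hypotheses per segment), and it scores a query by the smallest edit distance to any contiguous span of that sequence. The function names, the phone labels and the `max_errors` threshold are all hypothetical.

```python
def min_edit_distance_in_stream(query, stream):
    """Smallest edit distance between `query` and any contiguous span of `stream`."""
    # Row 0: an empty query matches anywhere in the stream at zero cost.
    prev = [0] * (len(stream) + 1)
    for i, q in enumerate(query, start=1):
        # Column 0: matching i query phones against an empty span costs i.
        curr = [i] + [0] * len(stream)
        for j, s in enumerate(stream, start=1):
            curr[j] = min(
                prev[j - 1] + (q != s),  # match or substitution
                prev[j] + 1,             # query phone missing from the stream
                curr[j - 1] + 1,         # extra phone in the stream
            )
        prev = curr
    # The match may end at any stream position.
    return min(prev)


def detect(query, index, max_errors=1):
    """Return ids of segments whose phone sequence approximately contains `query`."""
    return [
        seg_id
        for seg_id, phones in index.items()
        if min_edit_distance_in_stream(query, phones) <= max_errors
    ]


# Hypothetical index: segment id -> recognised phone sequence.
index = {
    "seg1": ["DH", "AH", "S", "P", "IY", "CH"],
    "seg2": ["N", "OY", "Z"],
}
hits = detect(["S", "P", "IY", "K"], index, max_errors=1)
```

Allowing one error lets the query match `seg1` even though the recogniser produced `CH` where the query has `K`, which is the kind of indexing error that approximate matching is meant to absorb.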
