Biblioteca Digital

994 results for Speech interaction

Improving Speech Interaction in Vehicles Using Context-Aware Information through A SCXML Framework

Relevance:

100.00%

Publisher:

Abstract:

Speech technologies can provide important benefits for the development of more usable and safer in-vehicle human-machine interactive systems (HMIs). However, mainly due to robustness issues, the use of spoken interaction can entail important distractions for the driver. In this challenging scenario, while speech technologies are evolving, further research is necessary to explore how they can be complemented with both other modalities (multimodality) and information from the increasing number of available sensors (context-awareness). The perceived quality of speech technologies can be significantly increased by implementing such policies, which simply try to make the best use of all the available resources, and the in-vehicle scenario is an excellent test-bed for this kind of initiative. In this contribution we propose an event-based HMI design framework which combines context modelling and multimodal interaction using a W3C XML language known as SCXML. SCXML provides a general process control mechanism that is being considered by W3C to improve both voice interaction (VoiceXML) and multimodal interaction (MMI). In our approach we try to anticipate and extend these initiatives, presenting a flexible SCXML-based approach for the design of a wide range of multimodal, context-aware in-vehicle HMI interfaces. The proposed framework for HMI design and specification has been implemented in an automotive OSGi service platform, and it is being used and tested in the Spanish research project MARTA for the development of several in-vehicle interactive applications.

Towards Real-time Speech Emotion Recognition for Affective E-Learning

Relevance:

70.00%

Publisher:

Abstract:

The original article is available as an open access file on the Springer website in the following link: http://link.springer.com/article/10.1007/s10639-015-9388-2

Design and path planning for a remote-brained service robot

Relevance:

60.00%

Publisher:

Abstract:

This paper presents a novel robot named "TUT03-A" with expert systems, speech interaction, vision systems, etc., based on a remote-brained approach. The robot is designed to have the brain and body separated. There is a cerebellum in the body. The brain, with the expert systems, is in charge of decisions, and the cerebellum controls the motion of the body. The brain-body interface can take many kinds of structure, which enables one brain to control one or more cerebellums. The brain controls all modules in the system and coordinates their work. The framework of the robot allows us to carry out different kinds of robotics research in an environment that can be shared and inherited over generations. We then discuss the path planning method for the robot, based on the ant colony algorithm. The mathematical model is established and the algorithm is implemented in the StarLogo simulation environment. The simulation result shows that it has strong robustness and acceptable pathfinding efficiency.

Les carrefours de la composition. De la musique comme art de la scène.

Relevance:

60.00%

Publisher:

Abstract:

The full version of this thesis is available only for individual consultation at the Bibliothèque de musique of the Université de Montréal (www.bib.umontreal.ca/MU).

O desenvolvimento e integração de estratégias de cálculo mental no 5.º ano de escolaridade

Relevance:

60.00%

Publisher:

Abstract:

Internship report submitted to the Escola Superior de Educação de Lisboa for the degree of Master in Teaching of the 1st and 2nd Cycles of Basic Education.

A negação nos provérbios: uma abordagem na aula de português

Relevance:

60.00%

Publisher:

Abstract:

With this dissertation we aim to show that the treatment of negation, as a linguistic process, should cover the full range of its syntactic, morphological, lexical, and enunciative realizations. Given its importance in discursive interaction, we consider it essential to systematize the mechanisms through which this system is realized in the European Portuguese norm and to demonstrate its occurrence in a large, authentic corpus that most speakers of Portuguese readily recognize. Treating Portuguese proverbs as documents of high cultural and linguistic interest, we examine the various processes that, in a selected set of texts, mark negative values. The study shows that the constructions making up Portuguese proverbial texts have enormous enunciative potential, which manifests itself above all at the level of interpretation, inference, and argumentation. Then, with the aim of helping to develop students' communicative competences, we outline an approach to these topics throughout compulsory schooling, based on the guidelines of the Programa and the Metas Curriculares de Português currently in force. The exercises proposed and applied are merely illustrative; nevertheless, the resulting conclusions may be a valid indicator for future practice.

A Likelihood-Maximizing Framework for Enhanced In-Car Speech Recognition Based on Speech Dialog System Interaction

Relevance:

40.00%

Publisher:

Abstract:

Speech recognition in car environments has been identified as a valuable means for reducing driver distraction when operating noncritical in-car systems. Under such conditions, however, speech recognition accuracy degrades significantly, and techniques such as speech enhancement are required to improve it. Likelihood-maximizing (LIMA) frameworks optimize speech enhancement algorithms based on recognized state sequences rather than traditional signal-level criteria such as maximizing signal-to-noise ratio. LIMA frameworks typically require calibration utterances to generate optimized enhancement parameters that are used for all subsequent utterances. Under such a scheme, suboptimal recognition performance occurs in noise conditions that are significantly different from those present during the calibration session, a serious problem in rapidly changing noise environments out on the open road. In this chapter, we propose a dialog-based design that allows regular optimization iterations in order to track the ever-changing noise conditions. Experiments using Mel-filterbank noise subtraction (MFNS) are performed to determine the optimization requirements for vehicular environments and show that minimal optimization is required to improve speech recognition, avoid over-optimization, and ultimately assist with semi-real-time operation. It is also shown that the proposed design is able to provide improved recognition performance over frameworks incorporating a calibration session only.

Improving the speech recognition performance of beginners in spoken conversational interaction for language learning

Relevance:

40.00%

Publisher:

Feature Selection for Speech Emotion Recognition in Spanish and Basque: On the Use of Machine Learning to Improve Human-Computer Interaction

Relevance:

40.00%

Publisher:

Abstract:

The study of emotions in human-computer interaction is a growing research area. This paper presents an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different methods for feature selection. The RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to seek the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. The results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm in automatic emotion recognition with all the different feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) has been applied and a comparison between them is provided. Based on the results, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested.
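The greedy forward search mentioned in this abstract can be illustrated with a minimal sketch. This is a generic version of forward feature subset selection, not the paper's FSS-Forward implementation: the feature names and the toy scoring function are hypothetical, and in practice the score would typically be something like the cross-validated accuracy of a classifier trained on the candidate subset.

```python
def forward_selection(features, score):
    """Greedy forward selection: repeatedly add the single feature that
    most improves score(subset), and stop when no addition helps."""
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda t: t[0])
        if top_score <= best:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Hypothetical toy score: two features carry signal, and every selected
# feature costs a small penalty (a stand-in for classifier accuracy).
USEFUL = {"pitch_mean": 0.4, "energy_mean": 0.3}


def toy_score(subset):
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.01 * len(subset)


selected, best = forward_selection(
    ["jitter", "pitch_mean", "speech_rate", "energy_mean"], toy_score
)
print(selected)
```

With this toy score the search keeps only the two informative features and stops once a further addition would lower the score.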

Integration of speech-processing technologies into Activobank's client interaction process

Relevance:

40.00%

Publisher:

Abstract:

This dissertation analyzes the possibilities of utilizing speech-processing technologies to transform the user experience of ActivoBank’s customers while using remote banking solutions. The technologies are examined through different criteria to determine if they support the bank’s goals and strategy and whether they should be incorporated in the bank’s offering. These criteria include the alignment with ActivoBank’s values, the suitability of the technology providers, the benefits these technologies entail, potential risks, appeal to the customers and impact on customer satisfaction. The analysis suggests that ActivoBank might not be in a position to adopt these technologies at this point in time.

Interaction and Persuasion: An analysis of the use of rhetorical devices in Gordon Brown's speech to the Labour Party Conference, on September 25, 2006

Relevance:

40.00%

Publisher:

Abstract:

This essay identifies and analyses rhetorical devices in Gordon Brown’s speech delivered at the Labour Party conference on September 25, 2006. The aim of the study was to identify specific rhetorical devices which are described as interactional resources, analyse their uses, and discuss possible effects that they may have when included in a political speech. The results are based on my own interpretations but are supported by information provided in current literature by analysts and researchers of rhetoric use. The findings could serve as evidence of the need for a better understanding of the devices used by politicians in their relentless endeavours to influence audience decisions.

Identifying the Common Ground of Emotion and Speech Acts in View of Human-Robot Interaction (HRI)

Relevance:

40.00%

Publisher:

Frequency, Form, and Distribution of Illocutionary Speech Acts in Swedish Parent-Child Interaction

Relevance:

40.00%

Publisher:

Abstract:

In this study, young children’s development of speech acts was examined. Interaction between six Swedish-speaking parents and their children was observed. The frequency, form and distribution of speech acts in the output from the parents were compared with the frequency, form and distribution of the children’s speech acts. The frequency was measured by occurrences per analysed session. The aim of the analysis was to examine whether the parents’ behaviour could be treated as a baseline for the child’s development. Both the parents’ and the children’s illocutionary speech acts were classified. Each parent-child dyad was observed on four different occasions, when the children were 1;0, 1;6, 2;0, and 2;6 years of age. Similar studies have previously shown that parents keep a consistent frequency of speech acts within a given time span of interaction, though the distribution of different types of speech acts may shift, depending on contextual factors. The form, in terms of Mean Length of Speech Act in Words (MLSAw), was correlated with the longitudinal result of the children’s MLSAw. The distribution of the parents’ speech acts showed extensive individual differences. The results showed that the children’s MLSAw moved significantly closer to the MLSAw of their parents. Since the parents’ MLSAw showed a wide distribution, these results indicate that the parents’ speech acts can be treated as a baseline for certain aspects of the children’s development, though further studies are needed.
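As a rough illustration of the MLSAw measure named in this abstract, the sketch below computes the mean number of words per speech act. It assumes that words are whitespace-separated tokens and that each speech act is given as one string; the study's actual segmentation and counting rules are not stated here, and the example utterances are invented.

```python
def mlsaw(speech_acts):
    """Mean Length of Speech Act in Words: the average number of
    whitespace-separated words per speech act (assumed tokenization)."""
    if not speech_acts:
        raise ValueError("need at least one speech act")
    return sum(len(act.split()) for act in speech_acts) / len(speech_acts)


# Invented example data, not taken from the study.
parent_acts = ["look at the ball", "do you want it", "yes"]
child_acts = ["ball", "want ball"]

print(mlsaw(parent_acts))  # (4 + 4 + 1) / 3 = 3.0
print(mlsaw(child_acts))   # (1 + 2) / 2 = 1.5
```

Comparing the two values across sessions is one simple way to see whether a child's MLSAw approaches the parent's over time.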

Phonological simplifications, apraxia of speech and the interaction between phonological and phonetic processing

Relevance:

40.00%

Publisher:

Abstract:

Research on aphasia has struggled to identify apraxia of speech (AoS) as an independent deficit affecting a processing level separate from phonological assembly and motor implementation. This is because AoS is characterized by both phonological and phonetic errors and, therefore, can be interpreted as a combination of deficits at the phonological and the motoric level rather than as an independent impairment. We apply novel psycholinguistic analyses to the perceptually phonological errors made by 24 Italian aphasic patients. We show that only patients with a relatively high rate (>10%) of phonetic errors make sound errors which simplify the phonology of the target. Moreover, simplifications are strongly associated with other variables indicative of articulatory difficulties (such as a predominance of errors on consonants rather than vowels) but not with other measures (such as the rate of words reproduced correctly or rates of lexical errors). These results indicate that sound errors cannot arise at a single phonological level because they are different in different patients. Instead, different patterns: (1) provide evidence for separate impairments and the existence of a level of articulatory planning/programming intermediate between phonological selection and motor implementation; (2) validate AoS as an independent impairment at this level, characterized by phonetic errors and phonological simplifications; (3) support the claim that linguistic principles of complexity have an articulatory basis since they only apply in patients with associated articulatory difficulties.

Assessment of Speech Dialog Systems using Multi-Modal Cognitive Load Analysis and Driving Performance Metrics

Relevance:

30.00%

Publisher:

Abstract:

In this paper, cognitive load analysis via acoustic- and CAN-Bus-based driver performance metrics is employed to assess two different commercial speech dialog systems (SDS) during in-vehicle use. Several metrics are proposed to measure increases in stress, distraction and cognitive load, and we compare these measures with statistical analysis of the speech recognition component of each SDS. It is found that care must be taken when designing an SDS, as it may increase cognitive load, which can be observed through increased speech response delay (SRD), changes in speech production due to negative emotion towards the SDS, and decreased driving performance on lateral control tasks. From this study, guidelines are presented for designing systems which are to be used in vehicular environments.
