817 results for Multimodal
Abstract:
Full paper presented at EC-TEL 2016
Abstract:
This paper describes a substantial effort to build a real-time interactive multimodal dialogue system with a focus on emotional and non-verbal interaction capabilities. The work is motivated by the aim to provide technology with competences in perceiving and producing the emotional and non-verbal behaviours required to sustain a conversational dialogue. We present the Sensitive Artificial Listener (SAL) scenario as a setting which seems particularly suited for the study of emotional and non-verbal behaviour, since it requires only very limited verbal understanding on the part of the machine. This scenario allows us to concentrate on non-verbal capabilities without having to address at the same time the challenges of spoken language understanding, task modelling, etc. We first summarise three prototype versions of the SAL scenario, in which the behaviour of the Sensitive Artificial Listener characters was determined by a human operator. These prototypes served the purpose of verifying the effectiveness of the SAL scenario and allowed us to collect data required for building system components for analysing and synthesising the respective behaviours. We then describe the fully autonomous integrated real-time system we created, which combines incremental analysis of user behaviour, dialogue management, and synthesis of speaker and listener behaviour of a SAL character displayed as a virtual agent. We discuss principles that should underlie the evaluation of SAL-type systems. Since the system is designed for modularity and reuse, and since it is publicly available, the SAL system has potential as a joint research tool in the affective computing research community.
Abstract:
Situational awareness is achieved naturally by the human senses of sight and hearing in combination. Automatic scene understanding aims at replicating this human ability using microphones and cameras in cooperation. In this paper, audio and video signals are fused and integrated at different levels of semantic abstraction. We detect and track a speaker who is relatively unconstrained, i.e., free to move indoors within an area larger than in comparable reported work, which is usually limited to round-table meetings. The system is relatively simple, consisting of just four microphone pairs and a single camera. Results show that the overall multimodal tracker is more reliable than single-modality systems, tolerating large occlusions and cross-talk. System evaluation is performed on both single- and multi-modality tracking. The performance improvement given by the audio–video integration and fusion is quantified in terms of tracking precision and accuracy as well as speaker diarisation error rate and precision–recall (recognition). Improvements over the closest works are: a 56% reduction in sound source localisation computational cost compared with an audio-only system, an 8% improvement in speaker diarisation error rate over an audio-only speaker recognition unit, and a 36% improvement on the precision–recall metric over an audio–video dominant-speaker recognition method.
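The abstract does not specify the fusion scheme; one common way such audio–video integration is done is inverse-variance (reliability-weighted) fusion of per-modality position estimates. A minimal sketch under that assumption — `fuse_estimates` and all numbers are illustrative, not the authors' implementation:

```python
import numpy as np

def fuse_estimates(audio_pos, audio_var, video_pos, video_var):
    """Fuse 2-D speaker position estimates from two modalities by
    inverse-variance weighting: the modality with the lower error
    variance contributes more to the fused estimate."""
    w_audio = 1.0 / audio_var
    w_video = 1.0 / video_var
    fused = (w_audio * np.asarray(audio_pos)
             + w_video * np.asarray(video_pos)) / (w_audio + w_video)
    fused_var = 1.0 / (w_audio + w_video)  # never larger than either input
    return fused, fused_var

# During a visual occlusion the video variance grows, so audio dominates:
pos, var = fuse_estimates(audio_pos=[2.0, 1.0], audio_var=0.04,
                          video_pos=[2.6, 1.2], video_var=0.36)
# → pos ≈ [2.06, 1.02], var = 0.036
```

This captures why the multimodal tracker can tolerate occlusions and cross-talk: when one modality degrades, its variance rises and the other modality automatically takes over the fused estimate.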
Abstract:
In-situ characterisation of thermocouple sensors is a challenging problem. Recently the authors presented a blind characterisation technique based on the cross-relation method of blind identification. The method allows in-situ identification of two thermocouple probes, each with a different dynamic response, using only sampled sensor measurement data. While the technique offers certain advantages over alternative methods, including low estimation variance and the ability to compensate for noise-induced bias, the robustness of the method is limited by the multimodal nature of the cost function. In this paper, a normalisation term is proposed which improves the convexity of the cost function. Further, a normalisation and bias compensation hybrid approach is presented that exploits the advantages of both normalisation and bias compensation. It is found that the optimum of the hybrid cost function is less biased and more stable than when only normalisation is applied. All results were verified by simulation.
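The cross-relation idea admits a short sketch. Assuming simple FIR models for the two probe responses (the paper's actual sensor models and normalisation term may differ — this is an illustration, not the authors' code): with a shared input, the two outputs satisfy a cross-convolution identity, and normalising the residual cost by the squared norm of the candidate responses removes the trivial all-zero minimum.

```python
import numpy as np

def cross_relation_cost(h1, h2, y1, y2, normalise=True):
    """Cross-relation cost for blind two-sensor identification.
    With a shared input u, y1 = h1 * u and y2 = h2 * u (convolution),
    so the true responses satisfy y1 * h2 == y2 * h1. The cost is the
    squared norm of the cross-relation residual; dividing by
    ||h1||^2 + ||h2||^2 excludes the trivial all-zero solution and
    improves the shape of the cost surface."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    residual = np.convolve(y1, h2) - np.convolve(y2, h1)
    cost = float(residual @ residual)
    if normalise:
        cost /= float(h1 @ h1 + h2 @ h2)
    return cost
```

At the true response pair the cost is (numerically) zero for any input, which is what makes the identification blind: only the measured outputs `y1`, `y2` are needed.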
Abstract:
Background: Sociocultural theories state that learning results from people participating in contexts where social interaction is facilitated. There is a need to create such facilitated pedagogical spaces where participants share their ways of knowing and doing. The aim of this exploratory study was to introduce pedagogical space for sociocultural interaction using ‘Identity Text’.
Methods: Identity texts are sociocultural artifacts produced by participants, which can be written, spoken, visual, musical, or multimodal. In 2013, participants of an international medical education fellowship program were asked to create their own Identity Texts to promote discussion about participants’ cultural backgrounds. Thematic analysis was used to make the analysis relevant to studying the pedagogical utility of the intervention.
Result: The Identity Text intervention created two spaces: a ‘reflective space’ helped participants reflect on sensitive topics like institutional environments, roles in interdisciplinary teams, and gender discrimination. A ‘narrative space’ allowed participants to tell powerful stories that provided cultural insights and challenged cultural hegemony; they described the conscious and subconscious transformation in identity that evolved secondary to struggles with local power dynamics and social demands involving the impact of family, peers, and country of origin.
Conclusion: Whilst the impact of providing pedagogical space using Identity Text on cognitive engagement and enhanced learning requires further research, the findings of this study suggest that it is a useful pedagogical strategy to support cross-cultural education.
Abstract:
The overall aim of this study is to investigate how two preschool groups can make meaning in two different exhibitions, by describing, analysing, and comparing the two groups' multimodal communication during exhibition visits. Through this aim, the study seeks to provide increased knowledge about multimodal communication in two preschool groups in two different exhibitions, including how semiotic resources can be used and what is brought into focus through language use, as communicative conditions for learning in exhibitions. The study is designed as a comparative ethnographic case study. Video recordings, MP3 audio recordings, participant observation, and field notes were used as data-collection instruments. Through strategic sampling, eleven three-year-old preschool children and three teachers were selected to visit an exhibition at Naturhistoriska Museet, and nine five-year-old preschool children with two teachers to visit an exhibition at Tom Tits Experiment (a science centre). The collected material was transcribed multimodally and could then be analysed as text. Multimodal interaction analysis (Norris 2004, 2014) and a language-use grid (Rostvall & West 2001) were used for the analysis. The results showed marked differences between the exhibitions. The multimodal interaction analysis showed that the science-centre exhibition involved a broader use of semiotic resources, as the meaning affordances of all artefacts were both visual and tactile, and some artefacts also offered auditory affordances. Language use in the same exhibition was highly varied, and the children prompted one another. The conclusion is that a) the preschool groups' meaning-making is conditioned by language use, and b) if meaning-making is to take place collaboratively, that possibility increases the more symmetrical the social relations are.
Abstract:
The physical appearance and behavior of a robot is an important asset in terms of Human-Computer Interaction. Multimodality is also fundamental, as we humans usually expect to interact in a natural way with voice, gestures, etc. People approach complex interaction devices with stances similar to those used in their interaction with other people. In this paper we describe a robot head, currently under development, that aims to be a multimodal (vision, voice, gestures,...) perceptual user interface.
Abstract:
This document describes planned investments in Iowa’s multimodal transportation system including aviation, transit, railroads, trails, and highways. This five-year program documents $3.5 billion of highway and bridge construction projects on the primary road system using federal and state funding. Of that funding, a little over $500 million is available due to the passage of Senate File 257 in February 2015. As required by Senate File 257, this program includes a list of the critical highway and bridge projects funded with the additional revenue. Since last year’s program, a new federal surface transportation authorization bill was passed and signed into law. This authorization bill is titled Fixing America’s Surface Transportation (FAST) Act. The FAST Act, for the first time in many years, provides federal funding certainty over most of the time covered by this Program. In addition, it provided additional federal funding for highway and bridge projects.
Abstract:
The present thesis explores how interaction is initiated in multi-party meetings in Adobe Connect 7.0, with a particular focus on how co-presence and mutual availability are established through the preambles of 18 meetings held in Spanish without a moderator. Taking Conversation Analysis (CA) as its methodological point of departure, the thesis comprises four studies, each analysing a particular phenomenon within the interaction of the preambles in a multimodal environment that allows simultaneous interaction through video, voice, and text-chat. The first study (Artículo I) shows how participants jointly solve the issue of availability in a technological environment where being online is not necessarily understood as being available for communication. The second study (Artículo II) focuses on the beginning of the audiovisual interaction, in particular on how participants check that the audiovisual mode is functioning correctly. The third study (Artículo III) explores silences within the interaction of the preamble, showing that the length of gaps and lapses becomes a significant aspect of the preambles and how they are connected to the issue of availability. Finally, the fourth study introduces the notion of modal alignment, an interactional phenomenon that systematically appears at the beginnings of the encounters and seems to be used and understood as a strategy for establishing mutual availability and negotiating the participation framework. As a whole, this research shows how participants, in order to establish mutual co-presence and availability, adapt to a particular technology in terms of participation management, deploying strategies and conveying successive actions which, as is the case with the activation of their respective webcams, seem to be understood as predictable within the intricate process of establishing mutual availability before the meeting starts.
Abstract:
In the research project "Prozesse der Sprachförderung im Kindergarten – ProSpiK", conversations between teachers and children are filmed and analysed sequentially in order to investigate their potential for the acquisition and promotion of academic-language skills. The goal is to develop the foundations of a level-appropriate (integrated, situation- and topic-oriented) language pedagogy that helps reduce educational inequality rather than reproduce it. Issue 3/2014 of the Schweizerische Zeitschrift für Bildungswissenschaften presented the design of the project and first results (on the phenomenon of "switching between reference spaces") (Isler, Künzli, & Wiesner, 2014). The present contribution examines the design of pedagogical conversations in more depth: it investigates which communicative means can support children in acquiring argumentation skills (taking and justifying their own positions). The first section addresses the importance of process quality in early education and conversations as contexts for the acquisition of language skills. The second section presents the central theoretical concepts underlying our analyses. The third section gives an exemplary insight into the data material and the analysis work. The fourth section uses an exemplary analysis to show how multimodal learning can take place in kindergarten. Finally, the results are discussed with reference to the project's research questions. (DIPF/Orig.)
Abstract:
Obesity is a major challenge to human health worldwide. Little is known about the brain mechanisms that are associated with overeating and obesity in humans. In this project, multimodal neuroimaging techniques were utilized to study brain neurotransmission and anatomy in obesity. Bariatric surgery was used as an experimental method for assessing whether the possible differences between obese and non-obese individuals change following the weight loss. This could indicate whether obesity-related altered neurotransmission and cerebral atrophy are recoverable or whether they represent stable individual characteristics. Morbidly obese subjects (BMI ≥ 35 kg/m2) and non-obese control subjects (mean BMI 23 kg/m2) were studied with positron emission tomography (PET) and magnetic resonance imaging (MRI). In the PET studies, focus was put on dopaminergic and opioidergic systems, both of which are crucial in the reward processing. Brain dopamine D2 receptor (D2R) availability was measured using [11C]raclopride and µ-opioid receptor (MOR) availability using [11C]carfentanil. In the MRI studies, voxel-based morphometry (VBM) of T1-weighted MRI images was used, coupled with diffusion tensor imaging (DTI). Obese subjects underwent bariatric surgery as their standard clinical treatment during the study. Preoperatively, morbidly obese subjects had significantly lower MOR availability but unaltered D2R availability in several brain regions involved in reward processing, including striatum, insula, and thalamus. Moreover, obesity disrupted the interaction between the MOR and D2R systems in ventral striatum. Bariatric surgery and concomitant weight loss normalized MOR availability in the obese, but did not influence D2R availability in any brain region. Morbidly obese subjects had also significantly lower grey and white matter densities globally in the brain, but more focal changes were located in the areas associated with inhibitory control, reward processing, and appetite. 
DTI also revealed signs of axonal damage in the obese subjects in the corticospinal tracts and occipito-frontal fascicles. Surgery-induced weight loss resulted in global recovery of white matter density as well as more focal recovery of grey matter density among obese subjects. Altogether, these results show that the endogenous opioid system is fundamentally linked to obesity. Lowered MOR availability is likely a consequence of obesity and may mediate the maintenance of excessive energy uptake. In addition, obesity has adverse effects on brain structure. Bariatric surgery, however, reverses MOR dysfunction and recovers cerebral atrophy. Understanding the opioidergic contribution to overeating and obesity is critical for developing new psychological or pharmacological treatments for obesity. The actual molecular mechanisms behind the positive changes in structure and neurotransmitter function remain unclear and should be addressed in future research.
Abstract:
Universal accessibility is very important for today's cities, because it allows any person, with or without physical disabilities, to carry out their socio-professional activities. Around the world, several projects have emerged, such as AXS Map in New York or AccesSIG in France. In Canada, a multidisciplinary project named MobiliSIG, with Quebec City as its experimental site, was launched in 2013. The objective of the MobiliSIG project is to design and develop a multimodal application to assist the mobility of people with reduced mobility. The project focuses mainly on building an accessibility database based on the PPH model (Processus de Production du Handicap, the Disability Creation Process). Our work aims to define the delivery of adapted, adaptable, and adaptive routes in multi-user, multi-platform, multimodal (interfaces and transport), and multi-environment contexts. Following a literature review, and in order to identify and define the needs related to this delivery of navigation data, we described several scenarios to better understand users' needs: planning a trip and navigating in the urban environment; a multimodal journey; and searching for a point of interest (accessible toilets). This approach also allowed us to identify the desired communication modes and representations of the route (map, text, image, speech, ...) and to propose an approach based on transforming the route obtained from the accessibility database. This transformation takes the user's preferences, device, and environment into account. The route is then delivered by a web dissemination service designed according to the W3C standard on multimodal architectures (MMI), combined with the concept of user-interface plasticity.
The prototype developed resulted in a system that generically delivers navigation information that is adapted, adaptable, and adaptive to the user, the device, and the environment.
Abstract:
When teaching students with visual impairments, educators generally rely on tactile tools to depict visual mathematical topics. Tactile media, such as embossed paper and simple manipulable materials, are typically used to convey graphical information. Although these tools are easy to use and relatively inexpensive, they are solely tactile and are not modifiable. Dynamic and interactive technologies such as pin matrices and haptic pens are also commercially available, but tend to be more expensive and less intuitive. This study aims to bridge the gap between easy-to-use tactile tools and dynamic, interactive technologies in order to facilitate the haptic learning of mathematical concepts. We developed a haptic assistive device using a Tanvas electrostatic touchscreen that provides the user with multimodal (haptic, auditory, and visual) output. Three methodological steps comprise this research: 1) a systematic literature review of the state of the art in the design and testing of tactile and haptic assistive devices, 2) a user-centered system design, and 3) testing of the system’s effectiveness via a usability study. The electrostatic touchscreen exhibits promise as an assistive device for displaying visual mathematical elements via the haptic modality.