Biblioteca Digital

74 results for recommender system, user profiling, personalization, implicit feedbacks

User-independent accelerometer-based gesture recognition for mobile devices

Relevance:

30.00%

Abstract:

Many mobile devices nowadays embed inertial sensors. This enables new forms of human-computer interaction through the use of gestures (movements performed with the mobile device) as a means of communication. This paper presents an accelerometer-based gesture recognition system for mobile devices which is able to recognize a collection of 10 different hand gestures. The system was conceived to be lightweight and to operate in a user-independent manner in real time. The recognition system was implemented on a smartphone and evaluated through a collection of user tests, which showed a recognition accuracy similar to other state-of-the-art techniques and a lower computational complexity. The system was also used to build a human-robot interface that enables controlling a wheeled robot with gestures made with the mobile phone.

Multiservice capacity and interference statistics of the uplink of high altitude platforms (HAPs) for asynchronous and synchronous WCDMA system

Relevance:

30.00%

Abstract:

In this work, the capacity and the interference statistics of the uplink of high-altitude platforms (HAPs) for asynchronous and synchronous WCDMA systems are studied, assuming finite transmission power and imperfect power control. The propagation loss used to calculate the received signal power is due to distance, shadowing, and wall insertion loss. The uplink capacity for 3G and 3.75G services is given for different cell radii, assuming outdoor and indoor voice users only, data users only, and a combination of the two services. For a 37-macrocell HAP, the total uplink capacity is 3,034 outdoor voice users or 444 outdoor data users. When one or more users are indoors, the uplink capacity is 2,923 voice users or 444 data users when the wall entry loss is 10 dB. It is shown that the effect of adjacent-channel interference is very small.

Design and modeling of the multi-agent robotic system: SMART

Relevance:

30.00%

Abstract:

This article presents the design, kinematic model and communication architecture of the multi-agent robotic system called SMART. The philosophy behind this kind of system requires the communication architecture to account for the concurrency of the whole system. The proposed architecture combines different communication technologies (TCP/IP and Bluetooth) under one protocol designed for cooperation among agents and other elements of the system, such as IP cameras, an image processing library, a path planner, a user interface, a control block and a data block. The high-level control is modeled by workflow Petri nets and implemented in C++ and C#. Experimental results show the performance of the designed architecture.

Assessing user bias in affect detection within context-based spoken dialog systems

Relevance:

30.00%

Abstract:

This paper presents empirical evidence of user bias within a laboratory-oriented evaluation of a Spoken Dialog System. Specifically, we address bias in users' satisfaction judgements. We question the reliability of this data for modeling user emotion, focusing on contentment and frustration in a spoken dialog system. This bias is detected through machine learning experiments conducted on two datasets, users and annotators, which were then compared in order to assess the reliability of these datasets. The target used was the satisfaction rating and the predictors were conversational/dialog features. Our results indicate that standard classifiers were significantly more successful in discriminating frustration and contentment, and the intensities of these emotions (reflected by user satisfaction ratings), from annotator data than from user data. Indirectly, the results show that conversational features are reliable predictors of the two above-mentioned emotions.

User benefit assessment by a dynamic LUTI model for Madrid

Relevance:

30.00%

Abstract:

Many researchers have assessed users' benefit from transport policy implementation using theoretical or empirical measures. However, few of them measure users' benefit in a way that differs from consumer surplus. This paper therefore assesses a new measure of user benefits that weights consumer surplus in order to include an equity assessment for different transport policies simulated in a dynamic medium-term LUTI model adapted to the case study of Madrid. Three transport policies (road pricing, parking charges and public transport improvement) have been simulated with the Metropolitan Activity Relocation Simulator (MARS), the LUTI model calibrated for Madrid. A social welfare function (WF) is defined using a cost-benefit analysis function that mainly includes the costs and benefits of users and operators of the transport system. In particular, the part of the welfare function concerning users (i.e. consumer surplus) is modified by a compensating weight (CW), which represents the inverse of household income level. Based on the modified social welfare function, the effects on the measure of user benefits are estimated and compared with the results of the original WF. The analysis shows that road pricing has a negative effect on user benefits, especially for low-income users. In fact, road pricing and parking charges act as regressive policies, especially in the long term. The public transport improvement scenario brings more positive effects on low-income user benefits. The integrated policy scenario (road pricing plus increased public services) is the one that yields the greatest user benefits. The results of this research could be key to understanding the relationship between transport system policies and the distribution of user benefits in a metropolitan context.
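As a rough illustration of the weighting idea in this abstract, the sketch below scales each income group's consumer surplus by a compensating weight equal to the inverse of its income level. The group names, the surplus and income figures, and the normalisation of income to the group mean are assumptions made for the example; they are not taken from the MARS model or from the paper.

```python
def compensating_weight(income, reference_income):
    """Inverse of the income level, expressed relative to a reference income."""
    if income <= 0:
        raise ValueError("income must be positive")
    return reference_income / income


def weighted_consumer_surplus(groups):
    """Sum consumer surplus over groups, each scaled by its compensating weight.

    `groups` maps a group name to (consumer_surplus, household_income).
    The reference income is the unweighted mean income of the groups,
    which is an assumption of this sketch.
    """
    incomes = [income for _, income in groups.values()]
    reference = sum(incomes) / len(incomes)
    return sum(
        surplus * compensating_weight(income, reference)
        for surplus, income in groups.values()
    )


# Hypothetical figures: a policy that costs low-income users 10 units
# and benefits high-income users by 10 units.
groups = {
    "low_income": (-10.0, 20_000.0),
    "high_income": (10.0, 60_000.0),
}
unweighted = sum(surplus for surplus, _ in groups.values())
weighted = weighted_consumer_surplus(groups)
print(unweighted, weighted)
```

With these made-up numbers the unweighted surplus nets to zero, while the weighted total is negative because the loss falls on the group with the larger weight. This is one way a regressive effect could become visible in such a measure.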

ClimApp: A Novel Approach of an Intelligent HVAC Control System

Relevance:

30.00%

Abstract:

This paper describes a novel deployment of an intelligent user-centered HVAC (Heating, Ventilation and Air Conditioning) control system. The main objective of this system is to optimize user comfort and to reduce energy consumption in office buildings. Existing commercial HVAC control systems work in a fixed and predetermined way. The novelty of the proposed system is that it adapts dynamically to the user and to the building environment. For this purpose the system architecture has been designed under the paradigm of Ambient Intelligence. A prototype of the proposed system has been tested in a real-world environment.

Mars: a personalised mobile activity recognition system

Relevance:

30.00%

Abstract:

Mobile activity recognition focuses on inferring the current activities of a mobile user by leveraging the sensory data that is available on today’s smartphones. The state of the art in mobile activity recognition uses traditional classification learning techniques. Thus, the learning process typically involves: i) collection of labelled sensory data that is transferred and collated in a centralised repository; ii) model building, where the classification model is trained and tested using the collected data; iii) a model deployment stage, where the learnt model is deployed on-board a mobile device to identify activities based on new sensory data. In this paper, we demonstrate the Mobile Activity Recognition System (MARS), where for the first time the model is built and continuously updated on-board the mobile device itself using data stream mining. The advantages of the on-board approach are that it allows model personalisation and increased privacy, as the data is not sent to any external site. Furthermore, when the user or their activity profile changes, MARS enables prompt adaptation. MARS has been implemented on the Android platform to demonstrate that it can achieve accurate mobile activity recognition. Moreover, we show in practice that MARS quickly adapts to user profile changes while at the same time being scalable and efficient in its consumption of device resources.
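The on-board, continuously updated model described above can be pictured with a small incremental classifier. The sketch below is not the MARS algorithm (the abstract only says that it uses data stream mining); it is a nearest-centroid learner with exponential forgetting, chosen for brevity. The class name, the `decay` parameter and the two-feature samples are hypothetical.

```python
import math


class StreamingCentroidClassifier:
    """Nearest-centroid classifier updated one labelled sample at a time."""

    def __init__(self, decay=0.9):
        # decay close to 1.0 forgets slowly; smaller values adapt faster.
        self.decay = decay
        self.centroids = {}

    def learn_one(self, features, label):
        """Move the centroid of `label` toward the new sample."""
        if label not in self.centroids:
            self.centroids[label] = list(features)
            return
        centroid = self.centroids[label]
        for i, value in enumerate(features):
            centroid[i] = self.decay * centroid[i] + (1.0 - self.decay) * value

    def predict_one(self, features):
        """Return the label whose centroid is closest, or None if untrained."""
        if not self.centroids:
            return None
        return min(
            self.centroids,
            key=lambda label: math.dist(self.centroids[label], features),
        )


model = StreamingCentroidClassifier(decay=0.5)
# Hypothetical (mean acceleration, variance) samples.
model.learn_one([0.1, 0.0], "still")
model.learn_one([1.0, 0.8], "walking")
print(model.predict_one([0.9, 0.7]))

# The user's "walking" profile drifts; repeated updates pull the centroid along.
for _ in range(5):
    model.learn_one([2.0, 1.6], "walking")
print(model.predict_one([1.8, 1.5]))
```

Because every update touches only one centroid, the model never needs the raw history, which is in the spirit of keeping the data on the device. A real system would use a proper stream-mining learner and far richer features.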

Methodology for developing a Speech into Sign Language Translation System in a New Semantic Domain

Relevance:

30.00%

Abstract:

This paper proposes a methodology for developing a speech into sign language translation system considering a user-centered strategy. This methodology consists of four main steps: analysis of technical and user requirements, data collection, technology adaptation to the new domain, and finally, evaluation of the system. The two most demanding tasks are the sign generation and the translation rules generation. Many other aspects can be updated automatically from a parallel corpus that includes sentences (in Spanish and LSE: Lengua de Signos Española) related to the application domain. In this paper, we explain how to apply this methodology in order to develop two translation systems in two specific domains: bus transport information and hotel reception.

Distributed Web Interface for Real-Time Spatial Audio Reproduction System

Relevance:

30.00%

Abstract:

SSR stands for SoundScape Renderer (a tool for real-time spatial audio reproduction providing a variety of rendering algorithms), a program written mostly in C++. The program lets the user listen to both previously recorded and live sounds. From the listener's point of view, each sound is heard as if it were produced at the point the program chooses; what makes this project interesting is that the sound can change position, move, and so on, all in real time. This is achieved without modifying the sound when it is recorded, only when it is played back: the program computes the variations needed so that the sound reaches the listener as if it were really generated at a given point in space, or as close to that as possible. The sensation of movement is simply that point changing position. The idea was to create a web application based on the HTML5 Canvas that would communicate with this remote user interface. This would solve all compatibility problems, since any device able to display web pages could run an application based on web standards, for example a Windows system or a mobile phone with a browser. The protocol had to be WebSocket because it is an HTML5 protocol and offers the latency “guarantees” that an application needing real-time information requires. It allows asynchronous full-duplex communication without much payload, which is precisely what avoiding ordinary HTML polling was meant to achieve. The problem that arose was that the program's network user interface was not compatible with WebSocket, because of the mandatory initial handshake that the protocol performs, so another network interface was needed. It was then decided to switch to JSON as the message interchange format.
In the end, the project comprises not only the Canvas-based web application but also a working server and the definition of a new network user interface with its accompanying protocol. ABSTRACT. This project aims to become part of the SSR tool in order to extend its capabilities in the area of access. SSR is an acronym for SoundScape Renderer, a program written mostly in C++ that lets you hear recorded or live sound, with a variety of sound equipment, as if the sound came from a desired place in space. As the SSR web page puts it: “The SoundScape Renderer (SSR) is a tool for real-time spatial audio reproduction providing a variety of rendering algorithms.” The application can be used with a graphical interface written in Qt, but it also has a network interface for external applications. This network interface communicates using XML messages. A good example is the Android client, which is already working. To use the application, it must be run by loading an audio source and the desired environment so that the renderer knows what to do. At that moment the server binds and anyone can use the network interface. Since the network interface is documented, anyone can write an application that interacts with it, so the application can have as many user interfaces as desired. The part developed in this project has nothing to do with audio rendering, nor with the reproduction of spatial audio; it concerns the interface used in the SSR application. As can be deduced from the title, “Distributed Web Interface for Real-Time Spatial Audio Reproduction System”, this work aims only to offer a web interface for the SSR (“Real-Time Spatial Audio Reproduction System”). The idea is not to make a new graphical interface for SSR but to allow more types of interfaces and communication.
To allow more graphical interfaces, this project uses a new network interface. Until now the SSR application has used only XML for data interchange, but this new network interface supports JSON. The project comprises the server that launches the application, the user interface and the new network interface. It is built from these modules so that new user interfaces can communicate with the server, or new servers with the user interface, by defining a complete network interface for data interchange.

Design, implementation and evaluation of an unconstrained and contactless biometric system based on hand geometry and stress detection

Relevance:

30.00%

Abstract:

This thesis proposes a hand-geometry biometric system aimed at contactless environments, together with a stress detection system able to tell what degree of stress a given person is under, based on physiological signals. Regarding the biometric system, this thesis contributes the design and implementation of a hand-geometry biometric system in which acquisition is carried out without any kind of contact and the user's template is created using only data from that individual. In addition, this thesis proposes a multiscale segmentation algorithm to solve the problems involved in acquiring hand images in real environments. Furthermore, with respect to feature extraction and subsequent matching, this thesis makes a specific contribution, proposing suitable schemes for carrying out these tasks at low computational cost but with high accuracy in recognising people. Finally, this system is evaluated according to the ISO/IEC 19795 standard, considering six public databases. Regarding the stress detection method, this thesis proposes a system based on two physiological signals, namely heart rate and skin conductance, as well as the creation of an innovative stress template that captures the behaviour of both signals under stress and non-stress situations. Moreover, this system relies on fuzzy logic to decide an individual's degree of stress. Overall, this system is able to detect stress accurately and in real time, providing a suitable solution for current biometric systems, where the stress detection system can be applied directly to avoid situations in which individuals are forced to provide their biometric data.
Finally, this thesis includes a user acceptability study, in which a total of 250 users assess their acceptance of the proposed biometric technique. It also includes a prototype implemented on a mobile device and its evaluation. ABSTRACT: This thesis proposes a hand biometric system oriented to unconstrained and contactless scenarios, together with a stress detection method able to elucidate to what extent an individual is under stress based on physiological signals. Concerning the biometric system, this thesis contributes the design and implementation of a hand-based biometric system, where the acquisition is carried out without contact and the template is created using information from a single individual only. In addition, this thesis proposes an algorithm based on multiscale aggregation in order to tackle the problem of segmentation in real unconstrained environments. Furthermore, feature extraction and matching are also specific contributions of this thesis, which provides adequate schemes to carry out both actions with low computational cost but with high recognition accuracy. Finally, this system is evaluated according to the international standard ISO/IEC 19795, considering six public databases. In relation to the stress detection method, this thesis proposes a system based on two physiological signals, namely heart rate and galvanic skin response, with the creation of an innovative stress detection template which gathers the behaviour of both physiological signals under both stressing and non-stressing situations. In addition, this system is based on fuzzy logic to elucidate the level of stress of an individual. Overall, this system is able to detect stress accurately and in real time, providing an adequate solution for current biometric systems, where a stress detection system can be applied directly to avoid situations in which individuals are forced to provide their biometric data.
Finally, this thesis includes a user acceptability evaluation, where the acceptance of the proposed biometric technique is assessed by a total of 250 individuals. In addition, this thesis includes a mobile implementation prototype and its evaluation.

An architecture for a heterogeneous private IaaS management system

Relevance:

30.00%

Abstract:

Cloud computing and, more particularly, private IaaS, is seen as a mature technology with a myriad of solutions to choose from. However, this disparity of solutions and products has instilled in potential adopters the fear of vendor and data lock-in. Several competing and incompatible interfaces and management styles have given even more voice to these fears. On top of this, cloud users might want to work with several solutions at the same time, an integration that is difficult to achieve in practice. In this paper, we propose a management architecture that tries to tackle these problems; it offers a common way of managing several cloud solutions, and an interface that can be tailored to the needs of the user. This management architecture is designed in a modular way and uses a generic information model. We have validated our approach through the implementation of the components needed for this architecture to support a sample private IaaS solution: OpenStack.

Business model with discount incentive in a P2P-cloud multimedia streaming system

Relevance:

30.00%

Abstract:

Today P2P faces two important challenges: the design of mechanisms to encourage users' collaboration in multimedia live streaming services, and the design of reliable algorithms with QoS provision to encourage multimedia providers to employ the P2P topology in commercial live streaming systems. We believe that these two challenges are tightly related and that there is much to be done with respect to both. This paper analyzes the effect of user behavior in a multi-tree P2P overlay and describes a business model based on monetary discount as an incentive in a P2P-Cloud multimedia streaming system. We believe a discount model can boost users' cooperation and loyalty and enhance the overall system integrity and performance. Moreover, the model bounds the constraints for a provider's revenue and cost if the P2P system is leveraged on a cloud infrastructure. Our case study shows that a streaming system provider can establish or adapt their business model by applying the described bounds to achieve a good discount-revenue trade-off and promote the system to users.

Incorporating proactivity to context-aware recommender systems for e-learning

Relevance:

30.00%

Abstract:

Recommender systems in e-learning have proved to be powerful tools for finding suitable educational material during the learning experience. But traditional user request-response patterns are still being used to generate these recommendations. By including contextual information derived from the use of ubiquitous learning environments, the possibility of incorporating proactivity into the recommendation process has arisen. In this paper we describe methods to push proactive recommendations to users of e-learning systems when the situation is appropriate, without requiring an explicit request. As a result, interesting learning objects can be recommended according to the user's needs in every situation. The impact of these proactive recommendations has been evaluated among teachers and scientists in a real e-learning social network called Virtual Science Hub, related to the GLOBAL excursion European project. Outcomes indicate that the proposed methods are valid for generating this kind of recommendation in e-learning scenarios. The results also show that the users' perceived appropriateness of having proactive recommendations is high.

Adapting a speech into sign language translation system to a new domain

Relevance:

30.00%

Abstract:

This paper presents a methodology for adapting an advanced communication system for deaf people to a new domain. This methodology is a user-centered design approach consisting of four main steps: requirement analysis, parallel corpus generation, technology adaptation to the new domain, and finally, system evaluation. In this paper, the new domain considered is dialogues at a hotel reception. With this methodology, it was possible to develop the system in a few months, obtaining very good performance: good speech recognition and translation rates (around 90%) with short processing times.

User Experience in Human-Technology Interaction. Communication, context and evaluation methodology

Relevance:

30.00%

Abstract:

This Thesis presents two related lines of research that contribute to the areas of Human-Technology (or Machine) Interaction (HTI or HMI), computational linguistics and user experience evaluation. The two lines in question are the design and the user-centred evaluation of advanced Human-Machine Interaction systems. The first part of the Thesis (Chapters 2 to 4) addresses fundamental questions in the design of advanced HMI systems. Chapter 2 presents an overview of the state of the art of research in multimodal conversational systems, which frames the research work presented in the rest of the Thesis. Chapters 3 and 4 focus on two major aspects of HMI system design: a generalised dialogue manager for handling multimodal, context-aware Human-Machine Interaction, and the use of embodied conversational agents (ECAs) to improve dialogue robustness, respectively. Chapter 3, on dialogue management, deals with handling the heterogeneity of the information coming from the communication modalities and from external sensors. This chapter proposes, at a high level of abstraction, an architecture for dialogue management with heterogeneous information inputs, relying on State Chart XML. Chapter 4 presents a contribution to the internal representation of communicative intentions and their translation into sequences of gestures to be performed by an ECA, designed specifically to improve robustness in critical dialogue situations that may arise, for example, when misunderstandings occur in the communication between the human user and the machine. These pages propose an extension of the Functional Mark-up Language defined in the SAIBA conceptual framework.
This extension makes it possible to represent communicative acts that realise intentions of the sender (the machine) which are not meant to be consciously perceived by the receiver (the human user), but which are intended to influence the receiver and the course of the dialogue. This is achieved by means of an object called the Communication Intention Base (CIB). Representing “non-declared” intentions in the CIB alongside the explicit ones allows the construction of communicative acts that realise several communicative intentions simultaneously. Chapter 4 also describes an experimental system for the (simulated) remote control of a home automation assistant, with speaker authentication to grant access, and with an ECA in the interface of each of these tasks. A description is included of the ECAs' verbal and non-verbal behaviour sequences, which were designed specifically for certain situations in order to improve dialogue robustness. Chapters 5 to 7 make up the part of the Thesis devoted to evaluation. Chapter 5 reviews relevant precedents in the literature on information technologies in general and on spoken interaction systems in particular. The main precedents in the field of interaction evaluation on which the work presented in this Thesis builds are the Technology Acceptance Model (TAM), the Subjective Assessment of Speech System Interfaces (SASSI) tool, and ITU-T Recommendation P.851. Chapter 6 describes an evaluation framework and methodology applied to the user experience with multimodal HMI systems. For this purpose, a novel framework was developed for the subjective evaluation of the quality of the user experience and its relationship with the user's acceptance of HMI technology (the framework is named the Subjective Quality Evaluation Framework).
This framework sets out a structure of classes of subjective factors related to the user's satisfaction with and acceptance of the proposed HMI technology. The structure, as proposed in this thesis, has two orthogonal dimensions. First, three broad classes of parameters related to user acceptance are identified: likeability (those that have to do with the experience of use, without entering into judgements of usefulness), rejection (which can only have a negative valence) and perceived usefulness. Second, this set of classes is reproduced for different “levels, or foci, of user perception”. These include, at a minimum, a level of overall system assessment, levels corresponding to the tasks to be performed and the goals to be achieved, and an interface level (in the cases proposed in this thesis, the interface is a dialogue system with or without an ECA). Chapter 7 presents an empirical evaluation of the system described in Chapter 4. The study builds on the above-mentioned precedents in the literature, extended with parameters for the specific study of animated agents (ECAs), users' self-assessment of their emotions, and certain rejection factors (specifically, concern about privacy and security). The subjective quality evaluation framework proposed in the previous chapter is also evaluated. The factor analyses carried out reveal a parameter structure that is conceptually very close to the usefulness-likeability-rejection class division proposed in that framework, a result that lends the framework some empirical validity. Analyses based on linear regressions reveal structures of dependency and interrelation among the subjective and objective parameters considered.
The central mediation effect described in the Technology Acceptance Model, that of perceived usefulness on the dependency relationship between intention to use and perceived ease of use, is confirmed in the study presented in this Thesis. In addition, it was found that this structure of relationships becomes stronger, in the specific study presented here, if the variables considered are generalised to cover more broadly the likeability and usefulness categories of the subjective quality evaluation framework. It was also observed that rejection factors appear as a component of their own in the factor analyses and are further distinguished by their behaviour: they moderate the relationship between intention to use (the main indicator of user acceptance) and its strongest predictor, perceived usefulness. Less important results are also presented concerning the effects of ECAs on dialogue system interfaces and on the perception parameters and user ratings that play a role in shaping acceptance of the technology. Although dialogue interaction performance is slightly better with ECAs, subjective opinions are very similar between the two experimental groups (one interacting with a dialogue system with an ECA, the other without). Among the small differences found between the two groups, the following stand out: in the experimental group without an ECA (that is, with a voice-only interface), dialogue problems (for example, recognition errors) had a more direct effect on perceived robustness, whereas the group with an ECA had a more positive emotional response when problems occurred. ECAs seem initially to raise higher expectations of the system's capabilities, and users in this group report being more self-confident in their interaction.
Finally, there are some signs of social effects of ECAs: the perceived “friendliness” of the ECAs was correlated with increased concern about security. Likewise, users of the system with ECAs tended more to blame themselves, rather than the system, for any dialogue problems that arose, whereas a slight opposite tendency was observed among users of the voice-only system. ABSTRACT This Thesis presents two related lines of research work contributing to the general fields of Human-Technology (or Machine) Interaction (HTI, or HMI), computational linguistics, and user experience evaluation. These two lines are the design and user-focused evaluation of advanced Human-Machine (or Technology) Interaction systems. The first part of the Thesis (Chapters 2 to 4) is centred on advanced HMI system design. Chapter 2 provides a background overview of the state of research in multimodal conversational systems. This sets the stage for the research work presented in the rest of the Thesis. Chapters 3 and 4 focus on two major aspects of HMI design in detail: a generalised dialogue manager for context-aware multimodal HMI, and embodied conversational agents (ECAs, or animated agents) to improve dialogue robustness, respectively. Chapter 3, on dialogue management, deals with how to handle information heterogeneity, both from the communication modalities and from external sensors. A highly abstracted architectural contribution based on State Chart XML is proposed. Chapter 4 presents a contribution for the internal representation of communication intentions and their translation into gestural sequences for an ECA, especially designed to improve robustness in critical dialogue situations such as when miscommunication occurs. We propose an extension of the functionality of Functional Mark-up Language, as envisaged in much of the work in the SAIBA framework.
Our extension allows the representation of communication acts that carry intentions that are not for the interlocutor to know of, but which are made to influence him or her as well as the flow of the dialogue itself. This is achieved through a design element we have called the Communication Intention Base. Such representation of “non-declared” intentions allows the construction of communication acts that carry several communication intentions simultaneously. Also in Chapter 4, an experimental system is described which allows (simulated) remote control of a home automation assistant, with biometric (speaker) authentication to grant access, featuring embodied conversational agents for each of the tasks. The discussion includes a description of the behavioural sequences for the ECAs, which were designed for specific dialogue situations with particular attention given to the objective of improving dialogue robustness. Chapters 5 to 7 form the evaluation part of the Thesis. Chapter 5 reviews evaluation approaches in the literature for information technologies, and in particular for speech-based interaction systems, that are useful precedents to the contributions of the present Thesis. The main evaluation precedents on which the work in this Thesis has built are the Technology Acceptance Model (TAM), the Subjective Assessment of Speech System Interfaces (SASSI) tool, and ITU-T Recommendation P.851. Chapter 6 presents the author’s work in establishing an evaluation framework and methodology applied to the users’ experience with multimodal HMI systems. A novel user-acceptance Subjective Quality Evaluation Framework was developed by the author specifically for this purpose. A class structure arises from two orthogonal sets of dimensions.
First, we identify three broad classes of parameters related to user acceptance: likeability factors (those that have to do with the experience of using the system), rejection factors (which can only have a negative valence) and perception of usefulness. Secondly, the class structure is further broken down into several “user perception levels”; at the very least: an overall system-assessment level, task and goal-related levels, and an interface level (e.g., a dialogue system with or without an ECA). An empirical evaluation of the system described in Chapter 4 is presented in Chapter 7. The study was based on the abovementioned precedents in the literature, expanded with categories covering the inclusion of an ECA, the users’ self-assessed emotions, and particular rejection factors (privacy and security concerns). The Subjective Quality Evaluation Framework proposed in the previous chapter was also scrutinised. Factor analyses revealed an item structure very much related conceptually to the usefulness-likeability-rejection class division introduced above, thus giving it some empirical weight. Regression-based analysis revealed structures of dependencies, paths of interrelations, between the subjective and objective parameters considered. The central mediation effect, in the Technology Acceptance Model, of perceived usefulness on the dependency relationship of intention-to-use with perceived ease of use was confirmed in this study. Furthermore, the pattern of relationships was stronger for variables covering more broadly the likeability and usefulness categories in the Subjective Quality Evaluation Framework. Rejection factors were found to have a distinct presence as components in factor analyses, as well as distinct behaviour: they were found to moderate the relationship between intention-to-use (the main measure of user acceptance) and its strongest predictor, perceived usefulness.
Insights of secondary importance are also given regarding the effect of ECAs on the interface of spoken dialogue systems and the dimensions of user perception and judgement that may have a role in determining user acceptance of the technology. Despite slightly better performance values in the case of the system with the ECA, subjective opinions regarding both systems were, overall, very similar. Minor differences between the two experimental groups (one interacting with an ECA, the other only through speech) include a more direct effect of dialogue problems (e.g., non-understandings) on perceived dialogue robustness for the voice-only interface test group, and a more positive emotional response for the ECA test group. Our findings further suggest that the ECA generates higher initial expectations, and users seem slightly more confident in their interaction with the ECA than do those without it. Finally, mild evidence of social effects of ECAs was also found: the perceived friendliness of the ECA increased security concerns, and ECA users may tend to blame themselves rather than the system when dialogue problems are encountered, while the opposite may be true for voice-only users.
