Modelling the user's experience in human-computer interaction is an important issue for the design and development of intelligent adaptive systems. In this context, particular attention is paid to the user's emotional reactions, because they strongly influence cognitive abilities such as perception and decision-making. Emotion modelling is particularly relevant for Emotionally Intelligent Tutoring Systems (Systèmes Tutoriels Émotionnellement Intelligents, STEI). These systems seek to identify the learner's emotions during learning sessions and to optimise the interaction experience through various intervention strategies. This thesis aims to improve the emotion modelling methods and the emotional strategies currently used by STEI to act on the learner's emotions.
More precisely, our first objective was to propose a new method for detecting the learner's emotional state, using different sources of information that allow emotions to be measured accurately while taking into account individual variables that can affect how emotions manifest. To this end, we developed a multimodal approach combining several physiological measures (brain activity, galvanic skin response and heart rate) with individual variables, in order to detect an emotion very frequently observed during learning sessions, namely uncertainty. First, we identified the key physiological indicators associated with this state, as well as the individual characteristics that contribute to its manifestation. We then developed predictive models that detect this state automatically from the analysed variables, by training machine learning algorithms.
Our second objective was to propose a unified approach for simultaneously recognising a combination of several emotions and explicitly assessing the impact of these emotions on the learner's interaction experience. For this purpose, we developed a hierarchical, probabilistic and dynamic platform that tracks the learner's emotional changes over time and automatically infers the general trend characterising the interaction experience, namely immersion, being stuck or disengagement. Immersion corresponds to an optimal experience: a state in which the learner is fully focused on and involved in the learning activity. The stuck state corresponds to a non-optimal interaction trend in which the learner has difficulty concentrating. Finally, disengagement corresponds to an extremely unfavourable state in which the learner is no longer involved in the learning activity at all. The proposed platform integrates three modalities of diagnostic variables for assessing the learner's experience (physiological variables, behavioural variables and performance measures), in combination with predictive variables that represent the current context of the interaction and the learner's personal characteristics. A study was carried out to validate our approach through an experimental protocol designed to deliberately elicit the three targeted trends while learners interacted with different learning environments.
Finally, our third objective was to propose new strategies for positively influencing the learner's emotional state without interrupting the dynamics of the learning session. To this end, we introduced the concept of implicit emotional strategies: a new approach for acting subtly on the learner's emotions in order to improve the learning experience. These strategies use subliminal perception, and more precisely a technique known as affective priming. This technique engages the learner's emotions unconsciously, through the projection of primes carrying certain affective connotations. We implemented an implicit emotional strategy using a particular form of affective priming, namely evaluative conditioning, which is intended to enhance self-esteem unconsciously. An experimental study was carried out to evaluate the impact of this strategy on learners' emotional reactions and performance.


We investigate solution sets of a special kind of linear inequality system. In particular, we derive characterizations of these sets in terms of minimal solution sets. The studied inequalities emerge as information inequalities in the context of Bayesian networks. This allows us to deduce important properties of Bayesian networks, which are relevant to causal inference.


The paper discusses the maintenance challenges of organisations with a very large number of devices and proposes the use of probabilistic models to assist monitoring and maintenance planning. The proposal assumes that instruments are connected so that they can report the features relevant to monitoring. It also requires enough historical records of diagnosed breakdowns to make the probabilistic models reliable and useful for the predictive maintenance strategies built on them. Regular Markov models based on estimated failure and repair rates are proposed to calculate the availability of the instruments, and Dynamic Bayesian Networks are proposed to model cause-effect relationships, so that predictive maintenance services can be triggered based on the influence between observed features and previously documented diagnostics.
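
As an illustration of how availability follows from failure and repair rates, the sketch below uses the simplest case, a continuous-time Markov model with two states (up and down) and constant rates. This model choice, the function names and the example rates are assumptions made for illustration; the abstract does not specify the structure of the models used.

```python
import math


def steady_state_availability(failure_rate: float, repair_rate: float) -> float:
    """Long-run fraction of time spent in the 'up' state.

    Assumes a two-state Markov model: up -> down at `failure_rate`,
    down -> up at `repair_rate` (both per unit time, both constant).
    """
    if failure_rate < 0 or repair_rate <= 0:
        raise ValueError("failure_rate must be >= 0 and repair_rate must be > 0")
    return repair_rate / (failure_rate + repair_rate)


def availability_at(t: float, failure_rate: float, repair_rate: float) -> float:
    """Probability of being 'up' at time t, given the device is up at t = 0."""
    total = failure_rate + repair_rate
    steady = repair_rate / total
    # The transient term decays exponentially towards the steady state.
    return steady + (failure_rate / total) * math.exp(-total * t)


if __name__ == "__main__":
    # Hypothetical instrument: one failure per 100 h, mean repair time 2 h.
    lam, mu = 1 / 100, 1 / 2
    print(f"steady-state availability: {steady_state_availability(lam, mu):.4f}")
    print(f"availability after 1 h:    {availability_at(1.0, lam, mu):.4f}")
```

In this simplified setting the rates would be estimated from the historical breakdown records mentioned above, for example as the reciprocals of the mean time between failures and the mean time to repair.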


Many weeds occur in patches but farmers frequently spray whole fields to control the weeds in these patches. Given a geo-referenced weed map, technology exists to confine spraying to these patches. Adoption of patch spraying by arable farmers has, however, been negligible partly due to the difficulty of constructing weed maps. Building on previous DEFRA and HGCA projects, this proposal aims to develop and evaluate a machine vision system to automate the weed mapping process. The project thereby addresses the principal technical stumbling block to widespread adoption of site specific weed management (SSWM). The accuracy of weed identification by machine vision based on a single field survey may be inadequate to create herbicide application maps. We therefore propose to test the hypothesis that sufficiently accurate weed maps can be constructed by integrating information from geo-referenced images captured automatically at different times of the year during normal field activities. Accuracy of identification will also be increased by utilising a priori knowledge of weeds present in fields. To prove this concept, images will be captured from arable fields on two farms and processed offline to identify and map the weeds, focussing especially on black-grass, wild oats, barren brome, couch grass and cleavers. As advocated by Lutman et al. (2002), the approach uncouples the weed mapping and treatment processes and builds on the observation that patches of these weeds are quite stable in arable fields. There are three main aspects to the project. 1) Machine vision hardware. Hardware component parts of the system are one or more cameras connected to a single board computer (Concurrent Solutions LLC) and interfaced with an accurate Global Positioning System (GPS) supplied by Patchwork Technology. The camera(s) will take separate measurements for each of the three primary colours of visible light (red, green and blue) in each pixel. 
The basic proof of concept can be achieved in principle using a single camera system, but in practice systems with more than one camera may need to be installed so that larger fractions of each field can be photographed. Hardware will be reviewed regularly during the project in response to feedback from other work packages and updated as required. 2) Image capture and weed identification software. The machine vision system will be attached to toolbars of farm machinery so that images can be collected during different field operations. Images will be captured at different ground speeds, in different directions and at different crop growth stages as well as in different crop backgrounds. Once geo-referenced images have been captured in the field, Murray State and Reading Universities will develop image analysis software to identify weed species, with advice from The Arable Group. A wide range of pattern recognition techniques, in particular Bayesian Networks, will be used to advance the state of the art in machine vision-based weed identification and mapping. Weed identification algorithms used by others are inadequate for this project as we intend to collect and correlate images captured at different growth stages. Plants grown for this purpose by Herbiseed will be used in the first instance. In addition, our image capture and analysis system will include plant characteristics such as leaf shape, size, vein structure, colour and textural pattern, some of which are not detectable by other machine vision systems or are omitted by their algorithms. From the list of features observable with our machine vision system, we will determine those that can be used to distinguish the weed species of interest. 3) Weed mapping. Geo-referenced maps of weeds in arable fields (Reading University and Syngenta) will be produced with advice from The Arable Group and Patchwork Technology. 
Natural infestations will be mapped in the fields but we will also introduce specimen plants in pots to facilitate more rigorous system evaluation and testing. Manual weed maps of the same fields will be generated by Reading University, Syngenta and Peter Lutman so that the accuracy of automated mapping can be assessed. The principal hypothesis and concept to be tested is that by combining maps from several surveys, a weed map with acceptable accuracy for end users can be produced. If the concept is proved and can be commercialised, systems could be retrofitted at low cost onto existing farm machinery. The outputs of the weed mapping software would then link with the precision farming options already built into many commercial sprayers, allowing their use for targeted, site-specific herbicide applications. Immediate economic benefits would, therefore, arise directly from reducing herbicide costs. SSWM will also reduce the overall pesticide load on the crop and so may reduce pesticide residues in food and drinking water, and reduce adverse impacts of pesticides on non-target species and beneficials. Farmers may even choose to leave unsprayed some non-injurious, environmentally beneficial, low-density weed infestations. These benefits fit very well with the anticipated legislation emerging in the new EU Thematic Strategy for Pesticides which will encourage more targeted use of pesticides and greater uptake of Integrated Crop (Pest) Management approaches, and also with the requirements of the Water Framework Directive to reduce levels of pesticides in water bodies. The greater precision of weed management offered by SSWM is therefore a key element in preparing arable farming systems for the future, where policy makers and consumers want to minimise pesticide use and the carbon footprint of farming while maintaining food production and security. 
The mapping technology could also be used on organic farms to identify areas of fields needing mechanical weed control, thereby reducing both carbon footprints and damage to crops by, for example, spring tines. Objectives: (i) to develop a prototype machine vision system for automated image capture during agricultural field operations; (ii) to prove the concept that images captured by the machine vision system over a series of field operations can be processed to identify and geo-reference specific weeds in the field; (iii) to generate weed maps from the geo-referenced weed plants/patches identified in objective (ii).
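
The central hypothesis, that several imperfect surveys can be combined into one sufficiently accurate map while exploiting a priori knowledge of the weeds present, can be illustrated with a small Bayesian update for a single map cell. This is only a sketch under strong assumptions that are not taken from the proposal: surveys are treated as conditionally independent given the true state of the cell, and the prior, sensitivity and false-positive rate are hypothetical numbers.

```python
def posterior_weed_probability(
    prior: float,
    detections: list[bool],
    sensitivity: float,
    false_positive_rate: float,
) -> float:
    """Probability that a map cell contains the weed after several surveys.

    prior               -- a priori probability of the weed in this cell
    detections          -- one True/False detection result per survey
    sensitivity         -- P(detected | weed present), assumed equal for all surveys
    false_positive_rate -- P(detected | weed absent), assumed equal for all surveys
    """
    for name, value in (("prior", prior), ("sensitivity", sensitivity),
                        ("false_positive_rate", false_positive_rate)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    # Work with odds so that each survey contributes one likelihood ratio.
    odds = prior / (1.0 - prior)
    for detected in detections:
        if detected:
            odds *= sensitivity / false_positive_rate
        else:
            odds *= (1.0 - sensitivity) / (1.0 - false_positive_rate)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical cell: 10% prior, surveys with 70% sensitivity and 20% false positives.
    for results in ([True], [True, True, True], [True, False, True]):
        p = posterior_weed_probability(0.10, results, 0.70, 0.20)
        print(results, round(p, 3))
```

Under these assumptions a single detection leaves the cell ambiguous, whereas repeated agreement across surveys moves the posterior much further from the prior, which is the intuition behind combining maps from different times of the year.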


A situation assessment uses reports from sensors to produce hypotheses about a situation at a level of aggregation that is of direct interest to a military commander. A low level of aggregation could mean forming tracks from reports, which is well documented in the tracking literature as track initiation and data association. This paper also discusses higher-level aggregation: assessing the membership of tracks in larger groups. Ideas used in joint tracking and identification are extended, using multi-entity Bayesian networks to model a number of static variables, of which the identity of a target is one. For higher-level aggregation, a scheme for hypothesis management is required. It is shown how an offline clustering of vehicles can be reduced to an assignment problem.


The substitution of missing values, also called imputation, is an important data preparation task in many domains. Ideally, the substitution of missing values should not introduce bias into the dataset. This aspect has usually been assessed by measures of the prediction capability of imputation methods. Such measures rely on simulating missing entries for some attributes whose values are actually known. These artificially missing values are imputed and then compared with the original values. Although this evaluation is useful, it does not allow the influence of imputed values on the ultimate modelling task (e.g. classification) to be inferred. We argue that imputation cannot be properly evaluated apart from the modelling task. Thus, alternative approaches are needed. This article elaborates on the influence of imputed values in classification. In particular, a practical procedure for estimating the inserted bias is described. As an additional contribution, we have used this procedure to illustrate empirically the performance of three imputation methods (majority, naive Bayes and Bayesian networks) on three datasets. Three classifiers (decision tree, naive Bayes and nearest neighbours) have been used as modelling tools in our experiments. The results illustrate a variety of situations that can arise in data preparation practice.
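
The conventional evaluation described above (hide known values, impute them, compare with the originals) can be sketched as follows for the majority method. The function names, the masking rate and the toy column are illustrative assumptions; this is the prediction-capability measure that the article argues is insufficient on its own, not the bias-estimation procedure it proposes.

```python
import random
from collections import Counter


def majority_impute(values: list) -> list:
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def imputation_accuracy(original: list, missing_rate: float, seed: int = 0) -> float:
    """Hide a fraction of known values, impute them, and score the matches."""
    rng = random.Random(seed)
    n_hidden = max(1, round(missing_rate * len(original)))
    hidden = set(rng.sample(range(len(original)), n_hidden))

    # Simulate missing entries for values that are actually known.
    masked = [None if i in hidden else v for i, v in enumerate(original)]
    imputed = majority_impute(masked)

    # Compare the imputed values with the originals on the hidden positions only.
    hits = sum(imputed[i] == original[i] for i in hidden)
    return hits / len(hidden)


if __name__ == "__main__":
    column = ["red"] * 7 + ["blue"] * 3  # toy categorical attribute
    print(imputation_accuracy(column, missing_rate=0.3))
```

A high score here only says that the hidden values were recovered well; it does not show how a classifier trained on the imputed data would behave, which is the gap the article addresses.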


A crucial aspect of evidential reasoning in crime investigation involves comparing the support that evidence provides for alternative hypotheses. Recent work in forensic statistics has shown how Bayesian Networks (BNs) can be employed for this purpose. However, the specification of BNs requires conditional probability tables describing the uncertain processes under evaluation. When these processes are poorly understood, it is necessary to rely on subjective probabilities provided by experts. Accurate probabilities of this type are normally hard to acquire from experts. Recent work in qualitative reasoning has developed methods to perform probabilistic reasoning using coarser representations. However, the latter types of approaches are too imprecise to compare the likelihood of alternative hypotheses. This paper examines this shortcoming of the qualitative approaches when applied to the aforementioned problem, and identifies and integrates techniques to refine them.


Discovering a precise causal structure that accurately reflects the given data is one of the most essential tasks in data mining and machine learning. One successful approach to causal discovery is the information-theoretic approach based on the Minimum Message Length (MML) principle [19]. This paper presents an improved version of the MML discovery algorithm together with further experimental results. We introduce a new encoding scheme for measuring the cost of describing the causal structure. Stirling's approximation is also applied to reduce the computational cost, so that the algorithm runs more efficiently. The experimental results for the current version of the discovery system show that: (1) the current version is capable of discovering what was discovered by the previous system; (2) the current system is capable of discovering more complicated causal models with a large number of variables; (3) the new version is more efficient than the previous version in terms of time complexity.
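
The abstract does not say which terms the Stirling-based simplification is applied to. A common situation in message-length calculations is a cost that contains the logarithm of a factorial, and the sketch below shows only that generic step: Stirling's approximation ln n! ≈ n ln n − n + ½ ln(2πn) compared with the exact value. Its use here is an assumption for illustration, not the paper's encoding scheme.

```python
import math


def log_factorial_exact(n: int) -> float:
    """ln(n!) computed through the log-gamma function."""
    return math.lgamma(n + 1)


def log_factorial_stirling(n: int) -> float:
    """Stirling's approximation: ln(n!) ~ n ln n - n + 0.5 ln(2 pi n)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0.0  # ln(0!) = 0; the formula itself is undefined at n = 0
    return n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n)


if __name__ == "__main__":
    for n in (5, 10, 100, 1000):
        exact = log_factorial_exact(n)
        approx = log_factorial_stirling(n)
        print(f"n={n:5d}  exact={exact:12.4f}  stirling={approx:12.4f}  "
              f"error={exact - approx:.6f}")
```

The absolute error shrinks roughly as 1/(12n), so for the sample sizes typical of structure learning the approximation changes a message length only marginally while replacing a factorial with a closed-form expression.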


This paper reports an examination of the performance of the improved MML-based causal model discovery algorithm. We first describe our improvement to the causal discovery algorithm, which introduces a new encoding scheme for measuring the cost of describing the causal structure. Stirling's approximation is also applied to reduce the computational cost, so that the algorithm runs more efficiently. This is followed by a detailed examination of the performance of the improved discovery algorithm. The experimental results for the current version of the discovery system show that: (1) the current version is capable of discovering what was discovered by the previous system; (2) the current system is capable of discovering more complicated causal networks with a large number of variables; (3) the new version is more efficient than the previous version in terms of time complexity.


The AMP-activated protein kinase (AMPK) acts as a metabolic master switch regulating several intracellular systems. The effect of AMPK on muscle cellular energy status makes this protein a promising pharmacological target for disease treatment. With AMPK regulation data increasingly available, it is critical to develop an efficient way to analyse the data, since this assists in further understanding AMPK pathways. Bayesian networks can play an important role in expressing the dependency and causality in the data. This paper aims to analyse the regulation data using B-Course, a powerful analysis tool that exploits several theoretically elaborate results in the fields of Bayesian and causal modelling, and to discover a certain type of multivariate probabilistic dependency. The identified dependency models are easier to understand than traditional frequent patterns.


In this paper, we consider the problem of tracking an object and predicting its future trajectory in a wide-area environment with a complex spatial layout and multiple sensors/cameras. Solving this problem requires representing the dynamic and noisy data in the tracking tasks and dealing with them at different levels of detail. We employ the Abstract Hidden Markov Model (AHMM), an extension of the well-known Hidden Markov Model (HMM) and a special type of Dynamic Probabilistic Network (DPN), as our underlying representation framework. The AHMM allows us to explicitly encode the hierarchy of connected spatial locations, making it scalable to the size of the environment being modeled. We describe an application for tracking human movement in an office-like spatial layout where the AHMM is used to track and predict the evolution of object trajectories at different levels of detail.
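
For readers unfamiliar with the flat model that the AHMM extends, the following sketch implements forward filtering in an ordinary discrete HMM: it maintains a belief over the current location from noisy sensor readings. It is not the AHMM and has no hierarchy; the two-room layout and all probabilities are hypothetical values chosen for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered belief P(state_t | obs_1..t) after each observation.

    prior       -- belief over states before the first transition
    transition  -- transition[i][j] = P(next state j | current state i)
    emission    -- emission[j][o]   = P(observation o | state j)
    observations -- sequence of observation indices
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction step: push the belief through the motion model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update step: weight each state by how well it explains the reading.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        if total == 0.0:
            raise ValueError("observation has zero probability under the model")
        belief = [w / total for w in weighted]
        history.append(belief)
    return history


if __name__ == "__main__":
    # Hypothetical layout: state 0 = corridor, state 1 = office.
    prior = [0.5, 0.5]
    transition = [[0.9, 0.1], [0.2, 0.8]]
    # Observation 0 = corridor camera fires, 1 = office camera fires.
    emission = [[0.8, 0.2], [0.3, 0.7]]
    for step, belief in enumerate(forward_filter(prior, transition, emission, [0, 0, 1]), 1):
        print(step, [round(b, 3) for b in belief])
```

In a flat model of this kind the transition table grows with the square of the number of locations, which is one reason a representation that encodes the hierarchy of locations can scale better to large environments.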


Abstraction plays an essential role in the way the agents plan their behaviours, especially to reduce the computational complexity of planning in large domains. However, the effects of abstraction in the inverse process – plan recognition – are unclear. In this paper, we present a method for recognising the agent’s behaviour in noisy and uncertain domains, and across multiple levels of abstraction. We use the concept of abstract Markov policies in abstract probabilistic planning as the model of the agent’s behaviours and employ probabilistic inference in Dynamic Bayesian Networks (DBN) to infer the correct policy from a sequence of observations. When the states are fully observable, we show that for a broad and often-used class of abstract policies, the complexity of policy recognition scales well with the number of abstraction levels in the policy hierarchy. For the partially observable case, we derive an efficient hybrid inference scheme on the corresponding DBN to overcome the exponential complexity.


Performance in triathlon is dependent upon factors that include somatotype, physiological capacity, technical proficiency and race strategy. Given the multidisciplinary nature of triathlon and the interaction between each of the three race components, the identification of target split times that can be used to inform the design of training plans and race pacing strategies is a complex task. The present study uses machine learning techniques to analyse a large database of performances in Olympic distance triathlons (2008–2012). The analysis reveals patterns of performance in five components of triathlon (three race “legs” and two transitions) and the complex relationships between performance in each component and overall performance in a race. The results provide three perspectives on the relationship between performance in each component of triathlon and the final placing in a race. These perspectives allow the identification of target split times that are required to achieve a certain final place in a race and the opportunity to make evidence-based decisions about race tactics in order to optimise performance.


The aim of this work is to test the application of a probabilistic graphical model, generically known as Bayesian networks, to the development of computational models that can be used to aid the understanding of problems and/or the forecasting of economic variables. To this end, a problem widely addressed in the literature was chosen, and the established theoretical and experimental results were compared with those obtained using the proposed technique. A model was built to classify the trend of the "country risk" for Brazil from a database of macroeconomic and financial variables. The EMBI+ (Emerging Markets Bond Index Plus) was adopted as the measure of risk, because it is an indicator widely used by the market.


In recent years, the research areas of Intelligent Agents, Multiagent Systems and Agent Communication have contributed to a revolution in how intelligent systems can be conceived, grounded and built. It therefore seems reasonable to suppose that intelligent systems working with probabilistic knowledge domains can share the same kind of benefits that more traditional Artificial Intelligence systems gained when they adopted the notions of agency, of systems composed of multiple agents, and of communication languages between those agents. However, there are open questions not only about how a probabilistic system could be scaled effectively to a multiagent architecture, but also about how to handle the communication and representation of probabilistic knowledge in this type of system, particularly given the limitations of current agent communication languages, which do not allow this type of knowledge to be communicated or represented. This work starts from these considerations and proposes a generalisation of the purely logical theoretical model that currently underpins communication in multiagent systems, one capable of representing probabilistic knowledge. It also proposes an extension of current communication languages that can support the communication of knowledge of a probabilistic nature. The compatibility of the new logical-probabilistic model with the current purely logical model is demonstrated by showing that theorems valid in the current model remain valid in the new model. The new model is defined as a probabilistic logic that extends the modal logic of current models. An axiomatic system is defined for this probabilistic logic, and its soundness and completeness are proved. 
Completeness is proved in a relative form: if the axiomatic system of the original modal logic is complete, then the axiomatic system of the probabilistic logic proposed as an extension is also complete. The communication language proposed in this work is formally defined by generalising current axiomatic theories of agency and communication to handle the communication of probabilistic knowledge, and by defining new communicative acts specific to this type of communication. This language is shown to be compatible with current languages in the non-probabilistic case. A new language for representing the content of communicative acts is also defined, based on the probabilistic logic used as the semantic model, which can express probabilistic and non-probabilistic knowledge in a uniform way. The expressiveness of these languages is verified through two applications. The first application shows how the new content language can be used to represent probabilistic knowledge expressed in the form of probabilistic knowledge representation that is currently most widely accepted, namely Bayesian networks or probabilistic belief networks. In the second application, interaction protocols based on the new communicative acts are proposed that can meet the communication needs of the consistency operations of Multiply Sectioned Bayesian Networks (MSBNs) in the case of multiagent systems.