817 results for multi-agent learning
Abstract:
This paper presents a decision support methodology for electricity market players’ bilateral contract negotiations. The proposed model is based on the application of game theory, using artificial intelligence to enhance the adaptive features of the decision support method. This model is integrated in AiD-EM (Adaptive Decision Support for Electricity Markets Negotiations), a multi-agent system that provides electricity market players with strategic behavior capabilities to improve their outcomes from energy contract negotiations. Although a variety of tools for the study and simulation of electricity markets has emerged during the past few years, these are mostly directed at the analysis of market models and the technical constraints of power systems, which makes them suitable for supporting the decisions of market operators and regulators. However, the equally important support of negotiating market players’ decisions has been largely neglected. The proposed model helps to close the existing gap in effective and realistic decision support for negotiating entities in electricity markets. The proposed method is validated by realistic electricity market simulations using real data from the Iberian market operator, MIBEL. Results show that the proposed adaptive decision support features enable electricity market players to improve their outcomes from bilateral contract negotiations.
Abstract:
The explosive growth of the Internet in recent years has been reflected in the ever-increasing diversity and heterogeneity of user preferences, of device types and features, and of access networks. Usually, the servers that deliver Web content do not take into account the heterogeneous contexts of the users who request it, so this content will not always suit their needs. In the particular case of e-learning platforms, this issue is especially critical because it puts at stake the knowledge acquired by their users. In this paper we present a system that aims to give the dotLRN e-learning platform the capability to adapt to its users’ context. By integrating dotLRN with a multi-agent hypermedia system, the online courses undertaken by students, as well as their learning environment, are adapted in real time.
Abstract:
Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal
Abstract:
One objective of artificial intelligence is to model the behavior of an intelligent agent interacting with its environment. The environment's transformations can be modeled as a Markov chain, whose state is partially observable to the agent and affected by its actions; such processes are known as partially observable Markov decision processes (POMDPs). While the environment's dynamics are assumed to obey certain rules, the agent does not know them and must learn. In this dissertation we focus on the agent's adaptation as captured by the reinforcement learning framework. This means learning a policy (a mapping of observations into actions) based on feedback from the environment. The learning can be viewed as browsing a set of policies while evaluating them by trial through interaction with the environment. The set of policies is constrained by the architecture of the agent's controller. POMDPs require a controller to have a memory. We investigate controllers with memory, including controllers with external memory, finite state controllers and distributed controllers for multi-agent systems. For these various controllers we work out the details of the algorithms which learn by ascending the gradient of expected cumulative reinforcement. Building on statistical learning theory and experiment design theory, a policy evaluation algorithm is developed for the case of experience re-use. We address the question of sufficient experience for uniform convergence of policy evaluation and obtain sample complexity bounds for various estimators. Finally, we demonstrate the performance of the proposed algorithms on several domains, the most complex of which is simulated adaptive packet routing in a telecommunication network.
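The idea of learning by ascending the gradient of expected reinforcement can be illustrated with a deliberately reduced sketch. It is not the dissertation's algorithm: it assumes a single observation, no controller memory, and known rewards, so the gradient is computed exactly instead of being estimated from experience. The function names, the reward values, and the step size are illustrative assumptions.

```python
import math


def softmax(theta):
    """Turn action preferences into a probability distribution."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(theta, rewards):
    """J(theta) = sum_a pi(a) * r(a) for a stateless softmax policy."""
    return sum(p * r for p, r in zip(softmax(theta), rewards))


def gradient_ascent(rewards, steps=500, lr=0.5):
    """Follow the exact gradient dJ/dtheta_a = pi(a) * (r(a) - J)."""
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(theta)
        j = sum(p * r for p, r in zip(probs, rewards))
        theta = [t + lr * p * (r - j) for t, p, r in zip(theta, probs, rewards)]
    return theta


# Hypothetical rewards: action 1 pays more than action 0.
rewards = [0.2, 1.0]
theta = gradient_ascent(rewards)
print(softmax(theta), expected_reward(theta, rewards))
```

With these assumed values the policy shifts its probability mass towards the better-paying action. In a POMDP the same ascent would have to work from sampled trajectories and a controller with memory, which is where the dissertation's contribution lies.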
Abstract:
This thesis proposes the use of intelligent agents in online learning environments to improve student assistance and motivation through personalized content that takes into account the student's learning style and level of knowledge. The proposed agents act as personal assistants that help the student carry out learning activities while measuring the student's progress and motivation. The agent environment is built on a multi-agent architecture called MASPLANG, designed to provide adaptive support (adaptive presentation and navigation) to an educational hypermedia system developed at the Universitat de Girona to deliver virtual education over the web. An important aspect of this proposal is the ability to build a hybrid student model that starts from a stereotypical student model based on learning styles and is gradually modified as the student interacts with the system (subjective preferences). Within the context of this thesis, learning is defined as the internal process that, under factors of change, results in the acquisition of the internal representation of a piece of knowledge or of an attitude. This internal process cannot be measured directly, only through externally observable demonstrations that constitute the behavior related to the object of knowledge. Finally, this change is the result of experience or training, and its durability depends on factors such as motivation and commitment. MASPLANG comprises two levels of agents: the intermediaries, called IA (information agents), at the lower level, and the interface agents, called PDA (assistant agents), at the upper level. The assistant agents attend to students as they work with the teaching material of a course or a learning lesson. 
This assistance consists of collecting and analyzing students' actions in order to offer personalized content, and of motivating the student during learning by offering feedback content, exercises adapted to the student's level of knowledge, and messages, through animated and attractive user interfaces. The information agents are responsible for maintaining the pedagogical and domain models and are the ones that interact fully with the system's databases (the record of student activities and the domain model). The operating scenario of MASPLANG is defined by the type of users and the type of content it offers. Since its environment is an educational hypermedia system, users are classified as teachers, who define and prepare the content for adaptive learning, and students, who carry out the learning activities in a personalized way. The student's initial learning profile is captured by evaluating the ILS questionnaire (the diagnostic tool of the FSLSM learning styles model adopted for this study), which is assigned to the student on first interaction with the system. This questionnaire consists of a set of psychological questions intended to determine the student's desires, habits, and reactions, which guide the personalization of the content and of the learning environment. The student model is then built taking into account this learning profile and the level of knowledge obtained by analyzing the student's actions in the environment.
Abstract:
This thesis addresses the problem of learning in physical heterogeneous multi-agent systems (MAS) and analyses the benefits of heterogeneous MAS with respect to homogeneous ones. An algorithm is developed for this task, building on previous work on stability in distributed systems by Tad Hogg and Bernardo Huberman and combining two phenomena observed in natural systems: task partition and hierarchical dominance. The algorithm allows agents to learn which tasks are best for them to perform, on the basis of each agent's skills and its contribution to the team's global performance. Agents learn by interacting with the environment and with their teammates, and receive rewards that depend on the results of the actions they perform. The algorithm is specifically designed for problems where all robots have to co-operate and work simultaneously towards the same goal. One example of such a problem is role distribution in a team of heterogeneous robots that form a soccer team, where all members take decisions and co-operate simultaneously. Soccer offers the possibility of conducting research in MAS in which co-operation plays a very important role in a dynamic and changing environment. For these reasons, and given the experience of the University of Girona in this domain, soccer has been selected as the test-bed for this research. In the case of soccer, tasks are grouped by means of roles. One of the most interesting features of this algorithm is that it endows MAS with high adaptability to changes in the environment: the team can perform its tasks while adapting to the environment. This is studied in several cases, for changes in the environment and in the robot's body. Other features are also analysed, especially a parameter that defines the fitness (in the biological sense) of each agent in the system, which contributes to performance and team adaptability. 
The algorithm is then applied to allow agents in teams of homogeneous and heterogeneous robots to learn which roles they should select in order to maximise team performance. The teams are compared, and their performance is evaluated in games against three hand-coded teams and against the different homogeneous and heterogeneous teams built in this thesis. This section focuses on the analysis of performance and task partition, in order to study the benefits of heterogeneity in physical MAS. To study heterogeneity rigorously, a diversity measure is developed, building on the hierarchic social entropy defined by Tucker Balch and adapted to quantify physical diversity in robot teams. This tool has useful properties, as it can be used in the future to design heterogeneous teams on the basis of knowledge of other teams.
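The diversity measure mentioned above builds on social entropy. As a rough illustration only, the sketch below computes the simple (non-hierarchic) form, that is, the Shannon entropy of how agents are distributed over groups. It is not the thesis's physical diversity measure and omits the hierarchic clustering step; the grouping by role label and the role names are assumptions made for the example.

```python
import math
from collections import Counter


def simple_social_entropy(group_labels):
    """Shannon entropy (in bits) of the distribution of agents over groups."""
    counts = Counter(group_labels)
    n = len(group_labels)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical teams described by one group label per robot.
homogeneous = ["striker"] * 4
mixed = ["striker", "striker", "defender", "defender"]
print(simple_social_entropy(homogeneous), simple_social_entropy(mixed))
```

A team whose robots all fall in one group scores zero, and the score grows as the robots spread more evenly over more groups.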
Abstract:
This work proposes an animated pedagogical agent whose role is to provide emotional support to the student: motivating and encouraging him, making him believe in his own ability, and promoting a positive mood, which fosters learning. This careful support by the agent, its affective tactics, is expressed through the emotional behaviour and encouragement messages of the lifelike character. Because humans have a social tendency to anthropomorphise software, we believe that a software agent can fulfil this affective role. In order to choose adequate affective tactics, the agent should also know the student’s emotions. The proposed agent recognises the student’s emotions (joy/distress, satisfaction/disappointment, anger/gratitude, and shame) from the student’s observable behaviour, i.e. his actions in the interface of the educational system. The inference of emotions is psychologically grounded in the cognitive theory of emotions. More specifically, we use the OCC model, which is based on the cognitive approach to emotion and can be computationally implemented. Due to the dynamic nature of the student’s affective information, we adopted a BDI approach to implement the affective user model and the affective diagnosis. In addition, we exploit the reasoning capacity of the BDI approach so that the agent can deduce the student’s appraisal, which allows it to infer the student’s emotions. As a case study, the proposed agent is implemented as the Mediating Agent of MACES, an educational collaborative environment modelled as a multi-agent system and pedagogically based on the sociocultural theory of Vygotsky.
Abstract:
We propose a new paradigm for collective learning in multi-agent systems (MAS) as a solution to the problem in which several agents acting in the same environment must simultaneously learn how to perform tasks, based on feedback given by each of the other agents. We introduce the proposed paradigm in the form of a reinforcement learning algorithm, which we call reinforcement learning with influence values. While learning from rewards, each agent evaluates the relation between the current state and/or the action executed in that state (its current belief) together with the reward obtained after all interacting agents have performed their actions. The reward is thus a result of the interference of others. The agent considers the opinions of all its colleagues when attempting to change the values of its states and/or actions. The idea is that the system as a whole must reach an equilibrium in which all agents are satisfied with the results obtained. This means that the values of the state-action pairs match the reward obtained by each agent. This dynamic way of setting the values of states and/or actions makes this new reinforcement learning paradigm the first to include, naturally, the fact that the presence of other agents makes the environment dynamic. As a direct result, we implicitly include the internal state, the actions, and the rewards obtained by all the other agents in the internal state of each agent. This makes our proposal the first complete solution to the conceptual problem that arises when applying reinforcement learning in multi-agent systems, which is caused by the difference between the environment and agent models. Based on the proposed model, we create the IVQ-learning algorithm, which is exhaustively tested in repeated games with two, three, and four agents, in stochastic games that require cooperation, and in games that require collaboration. 
The algorithm proves to be a good option for obtaining solutions that guarantee convergence to the optimal Nash equilibrium in cooperative problems. The experiments clearly show that the proposed paradigm is theoretically and experimentally superior to traditional approaches. Moreover, with the creation of this new paradigm, the set of reinforcement learning applications in MAS grows. That is, besides the possibility of applying the algorithm to traditional learning problems in MAS, such as the coordination of tasks in multi-robot systems, it becomes possible to apply reinforcement learning to problems that are essentially collaborative.
Abstract:
Humans and animals face decision tasks in an uncertain multi-agent environment where an agent's strategy may change over time due to the co-adaptation of others' strategies. The neuronal substrate and the computational algorithms underlying such adaptive decision making, however, are largely unknown. We propose a population coding model of spiking neurons with a policy gradient procedure that successfully acquires optimal strategies for classical game-theoretical tasks. The suggested population reinforcement learning reproduces data from human behavioral experiments for the blackjack and the inspector game. It performs optimally according to a pure (deterministic) and mixed (stochastic) Nash equilibrium, respectively. In contrast, temporal-difference (TD) learning, covariance learning, and basic reinforcement learning fail to perform optimally for the stochastic strategy. Spike-based population reinforcement learning, shown to follow the stochastic reward gradient, is therefore a viable candidate to explain automated decision learning of a Nash equilibrium in two-player games.
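To make the notion of a mixed (stochastic) Nash equilibrium concrete, the sketch below solves a two-player, two-action game analytically with the standard indifference conditions. It is not the spiking-neuron model of the paper, and the payoff matrices are hypothetical examples, not the inspector-game payoffs used in the experiments. The function name `mixed_nash_2x2` and the integer-payoff assumption are choices made for this illustration.

```python
from fractions import Fraction


def mixed_nash_2x2(row_payoffs, col_payoffs):
    """Return (p, q): probabilities of the first row and the first column.

    Assumes integer payoffs and a game without a pure equilibrium, so each
    player mixes to make the opponent indifferent between two actions.
    """
    a, b = row_payoffs, col_payoffs
    p_den = b[0][0] - b[0][1] - b[1][0] + b[1][1]
    q_den = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    if p_den == 0 or q_den == 0:
        raise ValueError("no interior mixed equilibrium for these payoffs")
    # p makes the column player indifferent; q makes the row player indifferent.
    p = Fraction(b[1][1] - b[1][0], p_den)
    q = Fraction(a[1][1] - a[0][1], q_den)
    if not (0 <= p <= 1 and 0 <= q <= 1):
        raise ValueError("indifference solution is not a valid probability")
    return p, q


# Matching pennies (zero-sum): both players mix evenly.
pennies_row = [[1, -1], [-1, 1]]
pennies_col = [[-1, 1], [1, -1]]
print(mixed_nash_2x2(pennies_row, pennies_col))

# Hypothetical asymmetric game without a pure equilibrium.
print(mixed_nash_2x2([[2, 0], [0, 1]], [[0, 1], [1, 0]]))
```

A learner that reaches such an equilibrium has to keep its action choice random with these probabilities, which is why the abstract distinguishes it from the deterministic case.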
Abstract:
Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models recently proposed to deal with multi-dimensional classification problems, where each instance in the data set has to be assigned to more than one class variable. In this paper, we propose a Markov blanket-based approach for learning MBCs from data. Basically, it consists of determining the Markov blanket around each class variable using the HITON algorithm, then specifying the directionality over the MBC subgraphs. Our approach is applied to the prediction problem of the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39) in order to estimate the health-related quality of life of Parkinson’s patients. Fivefold cross-validation experiments were carried out on randomly generated synthetic data sets, on the Yeast data set, and on a real-world Parkinson’s disease data set containing 488 patients. The experimental study, including comparison with additional Bayesian network-based approaches, back propagation for multi-label learning, multi-label k-nearest neighbor, multinomial logistic regression, ordinary least squares, and censored least absolute deviations, shows encouraging results in terms of predictive accuracy as well as the identification of dependence relationships among class and feature variables.
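For readers unfamiliar with the term, the Markov blanket of a variable in a Bayesian network is the set of its parents, its children, and the other parents of its children. The sketch below reads that set off a graph that is already known. This is only an illustration of the definition: HITON, as used in the paper, estimates the blanket from data, and the graph and the `markov_blanket` helper here are hypothetical.

```python
def markov_blanket(parents, node):
    """Parents, children, and the children's other parents of `node`.

    `parents` maps every node to the list of its parents in a DAG.
    """
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents.get(node, []))
    blanket.update(children)
    for child in children:
        blanket.update(parents[child])
    blanket.discard(node)
    return blanket


# Hypothetical graph: A -> C, B -> C, C -> D, E -> D.
dag = {"A": [], "B": [], "C": ["A", "B"], "D": ["C", "E"], "E": []}
print(sorted(markov_blanket(dag, "C")))
```

In this example the blanket of C contains E even though the two are not directly connected, because both are parents of D.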
Abstract:
This paper presents an adaptable InfoStation-based multi-agent system that facilitates the provision of mobile eLearning (mLearning) services within a university campus. A horizontal view of the network architecture is presented. The main communication scenarios are considered by describing in detail the interaction of the system entities involved in mLearning service provision. The mTest service is explored as a practical example. System implementation approaches are also considered.
Abstract:
Postprint
Abstract:
The integration of distributed and ubiquitous intelligence has emerged over the last years as the mainspring of transformative advancements in mobile radio networks. As we approach the era of “mobile for intelligence”, next-generation wireless networks are poised to undergo significant and profound changes. Notably, the overarching challenge that lies ahead is the development and implementation of integrated communication and learning mechanisms that will enable the realization of autonomous mobile radio networks. The ultimate pursuit of eliminating human-in-the-loop constitutes an ambitious challenge, necessitating a meticulous delineation of the fundamental characteristics that artificial intelligence (AI) should possess to effectively achieve this objective. This challenge represents a paradigm shift in the design, deployment, and operation of wireless networks, where conventional, static configurations give way to dynamic, adaptive, and AI-native systems capable of self-optimization, self-sustainment, and learning. This thesis aims to provide a comprehensive exploration of the fundamental principles and practical approaches required to create autonomous mobile radio networks that seamlessly integrate communication and learning components. The first chapter of this thesis introduces the notion of Predictive Quality of Service (PQoS) and adaptive optimization and expands upon the challenge to achieve adaptable, reliable, and robust network performance in dynamic and ever-changing environments. The subsequent chapter delves into the revolutionary role of generative AI in shaping next-generation autonomous networks. This chapter emphasizes achieving trustworthy uncertainty-aware generation processes with the use of approximate Bayesian methods and aims to show how generative AI can improve generalization while reducing data communication costs. Finally, the thesis embarks on the topic of distributed learning over wireless networks. 
Distributed learning and its variants, including multi-agent reinforcement learning systems and federated learning, have the potential to meet the scalability demands of modern data-driven applications, enabling efficient and collaborative model training across dynamic scenarios while ensuring data privacy and reducing communication overhead.
Abstract:
In this paper we describe a distributed object oriented logic programming language in which an object is a collection of threads deductively accessing and updating a shared logic program. The key features of the language, such as static and dynamic object methods and multiple inheritance, are illustrated through a series of small examples. We show how we can implement object servers, allowing remote spawning of objects, which we can use as staging posts for mobile agents. We give as an example an information gathering mobile agent that can be queried about the information it has so far gathered whilst it is gathering new information. Finally we define a class of co-operative reasoning agents that can do resource bounded inference for full first order predicate logic, handling multiple queries and information updates concurrently. We believe that the combination of the concurrent OO and the LP programming paradigms produces a powerful tool for quickly implementing rational multi-agent applications on the internet.