817 results for multi-agent learning


Relevance:

30.00% 30.00%

Publisher:

Abstract:

The objective of this thesis by articles is to present, modestly, a few steps along the path that will (we hope) lead to a general solution to the problem of artificial intelligence. The thesis contains four articles, each of which presents a different new method for perceptual inference using machine learning and, more specifically, deep neural networks. Each article demonstrates the usefulness of its proposed method on a computer vision task. These methods are applicable in a more general context, and in some cases they have been applied elsewhere, but this is not covered in this thesis. In the first article, we present two new variational inference algorithms for the generative image model called spike-and-slab sparse coding (S3C). These faster inference methods allow us to use S3C models that are much larger than before. We show that they are better at extracting feature detectors when very few labeled examples are available for training. Building on an S3C model, we then construct a deep architecture, the partially directed deep Boltzmann machine (PD-DBM). This model was designed to simplify the training of deep Boltzmann machines, which normally require a greedy pre-training phase for each layer. This problem is solved to some extent, but the cost of inference in the new model is too high for it to be used in practice. In the second article, we return to the problem of jointly training deep Boltzmann machines. This time, instead of changing the model family, we introduce a new training criterion that gives rise to multi-prediction deep Boltzmann machines (MP-DBM).
MP-DBMs can be trained in a single stage and achieve better classification accuracy than classical DBMs. They can also be trained with standard variational methods instead of requiring a discriminative classifier to reach good classification accuracy. One drawback of such models is their inability to generate samples, but this is not too serious, since the classification performance of deep Boltzmann machines is no longer a priority given recent advances in supervised learning. Nevertheless, MP-DBMs remain interesting because they can perform certain tasks that purely supervised models cannot, such as classifying incomplete data or intelligently filling in the information missing from such data. The work presented in this thesis took place in the midst of a period of major transformation in the field of deep neural network learning, triggered by Geoffrey Hinton's discovery of the “dropout” algorithm. Dropout makes purely supervised training of feedforward architectures possible without exposure to the danger of overfitting. The third article presented in this thesis introduces a new activation function designed specifically to work with dropout. This activation function, called maxout, enables the use of cross-channel pooling in a purely supervised learning setting. We show how several object recognition tasks are better accomplished by using maxout. Finally, we present a real industrial use case: the transcription of multi-digit house addresses.
By combining maxout with a new kind of output layer for convolutional neural networks, we show that it is possible to reach accuracy comparable to that of humans on a challenging dataset of photos taken by Google's cars. This system has been successfully deployed at Google to read roughly one hundred million house addresses.
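
The maxout activation mentioned in this abstract can be illustrated with a minimal sketch in plain Python: each unit outputs the maximum over several linear pieces of its input. The function name, the nested-list weight layout and the toy weights below are illustrative assumptions and are not taken from the thesis.

```python
from typing import List, Sequence


def maxout(x: Sequence[float],
           weights: Sequence[Sequence[Sequence[float]]],
           biases: Sequence[Sequence[float]]) -> List[float]:
    """Maxout layer: weights[i][j] is the weight vector of linear piece j of unit i.

    Each unit returns max_j (w_ij . x + b_ij). The data layout is an assumption
    made for this sketch.
    """
    outputs = []
    for unit_w, unit_b in zip(weights, biases):
        # Evaluate every linear piece of this unit, then keep the largest value.
        pieces = [sum(wk * xk for wk, xk in zip(w, x)) + b
                  for w, b in zip(unit_w, unit_b)]
        outputs.append(max(pieces))
    return outputs


# Toy example (hypothetical weights): one unit with pieces +x and -x computes |x|.
abs_unit_w = [[[1.0], [-1.0]]]
abs_unit_b = [[0.0, 0.0]]
print(maxout([-3.0], abs_unit_w, abs_unit_b))  # [3.0]
```

With two pieces per unit, a maxout unit can represent functions such as the absolute value or a rectifier, which is why the toy example uses the pieces +x and -x.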

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Increasingly frequent restructurings and transformations in companies are making employees' career paths less linear and multiplying role changes (Delobbe & Vandenberghe, 2000). Organizations must pay increasing attention to integrating these new employees in order to pass on the fundamental elements of the functioning and culture they favor. However, most research on organizational socialization focuses on "best practices", and the resulting findings are mixed. This comparative study seeks to determine whether, and on which variables, new employees socialized by their company differ from "non-socialized" new employees. First, the study compares the two groups on 1) the proximal outcomes (mastery of the content of organizational socialization and role clarity) and 2) the distal outcomes (affective organizational commitment, job satisfaction, and intention to quit) of the organizational socialization process, as well as on 3) the characteristics of social information networks, while controlling for proactivity. Second, the study explores whether the organizational socialization process (the relationships among the variables) differs between socialized and non-socialized new employees. Fifty-three new employees (less than one year of tenure) of a large Quebec company took part in the study. The company has a socialization program in place, but its execution is left to the discretion of each department, creating two categories of new employees: those who were socialized by their department and those who were not ("non-socialized").
Participants were surveyed on proactive strategies, proximal and distal outcomes, and the characteristics of social information networks. For the first objective, the results indicate that socialized new employees master the content of organizational socialization better than non-socialized new employees. For the second objective, differences in the organizational socialization process were found. For "non-socialized" new employees, proactive information seeking and feedback seeking are related to certain social network characteristics, positive framing is related to job satisfaction and intention to quit, and role clarity is related only to job satisfaction. Socialized new employees, for their part, show links between mastery of the content of organizational socialization and each of the distal outcomes (affective organizational commitment, job satisfaction, and intention to quit). Overall, the integration of non-socialized new employees appears to be influenced mainly by their proactive strategies, whereas that of socialized new employees appears to be facilitated by their mastery of the content of organizational socialization. More generally, this comparative study offers an interesting view of new employees that is rarely found in research on the "best practices" of organizational socialization. Recommendations for research and practice follow.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

One major component of power system operation is generation scheduling. The objective of this work is to develop efficient control strategies for power scheduling problems through Reinforcement Learning approaches. The three important active power scheduling problems are Unit Commitment, Economic Dispatch and Automatic Generation Control. Numerical solution methods proposed for power scheduling are insufficient for handling large and complex systems. Soft Computing methods such as Simulated Annealing and Evolutionary Programming are efficient in handling complex cost functions, but are limited in handling the stochastic data that exist in a practical system. Also, the learning steps have to be repeated for each load demand, which increases the computation time. Reinforcement Learning (RL) is a method of learning through interactions with the environment. The main advantage of this approach is that it does not require a precise mathematical formulation. It can learn either by interacting with the environment or by interacting with a simulation model. Several optimization and control problems have been solved through the Reinforcement Learning approach. Applications of Reinforcement Learning in the field of power systems have been few. The objective is to introduce and extend Reinforcement Learning approaches for the active power scheduling problems in an implementable manner. The main objectives can be enumerated as: (i) Evolve Reinforcement Learning based solutions to the Unit Commitment Problem. (ii) Find suitable solution strategies through the Reinforcement Learning approach for Economic Dispatch. (iii) Extend the Reinforcement Learning solution to Automatic Generation Control with a different perspective. (iv) Check the suitability of the scheduling solutions for one of the existing power systems. The first part of the thesis is concerned with the Reinforcement Learning approach to the Unit Commitment problem. The Unit Commitment Problem is formulated as a multi stage decision process.
A Q learning solution is developed to obtain the optimum commitment schedule. The method of state aggregation is used to formulate an efficient solution considering the minimum up time / down time constraints. The performance of the algorithms is evaluated for different systems and compared with other stochastic methods like Genetic Algorithm. The second stage of the work is concerned with solving the Economic Dispatch problem. A simple and straightforward decision making strategy is first proposed in the Learning Automata algorithm. Then, to solve the scheduling task for systems with a large number of generating units, the problem is formulated as a multi stage decision making task. The solution obtained is extended in order to incorporate the transmission losses in the system. To make the Reinforcement Learning solution more efficient and to handle continuous state space, a function approximation strategy is proposed. The performance of the developed algorithms is tested on several standard test cases. The proposed method is compared with other recent methods like Partition Approach Algorithm and Simulated Annealing. As the final step of implementing the active power control loops in a power system, Automatic Generation Control is also taken into consideration. Reinforcement Learning has already been applied to solve the Automatic Generation Control loop. The RL solution is extended to take up the approach of a common frequency for all the interconnected areas, which is more similar to practical systems. The performance of the RL controller is also compared with that of the conventional integral controller. In order to prove the suitability of the proposed methods for practical systems, the second plant of Neyveli Thermal Power Station (NTPS II) is taken as a case study.
The performance of the Reinforcement Learning solution is found to be better than that of the other existing methods, which provides a promising step towards RL based control schemes for the practical power industry. Reinforcement Learning is applied to solve the scheduling problems in the power industry and found to give satisfactory performance. The proposed solution provides scope for more profit, as the economic schedule is obtained instantaneously. Since the Reinforcement Learning method can take the stochastic cost data obtained from time to time from a plant, it gives an implementable method. As a further step, with suitable methods to interface with online data, economic scheduling can be achieved instantaneously in a generation control center. Power scheduling of systems with different sources such as hydro and thermal can also be looked into, and Reinforcement Learning solutions can be achieved.
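
The Q learning step this abstract refers to can be sketched as the standard tabular update rule. This is a generic illustration, not the thesis's algorithm: the state and action encodings, the learning rate `alpha`, the discount `gamma` and the use of negative cost as reward are all assumptions made here.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_learning_update(q: QTable, state: Hashable, action: Hashable,
                      reward: float, next_state: Hashable,
                      next_actions: Iterable[Hashable],
                      alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    # A terminal stage has no next actions, so its future value is taken as 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical scheduling step: the reward is the negative cost of the decision.
q: QTable = defaultdict(float)
q[("hour1", "unit_on")] = 10.0
new_value = q_learning_update(q, "hour0", "unit_on", reward=-5.0,
                              next_state="hour1", next_actions=["unit_on"],
                              alpha=0.5, gamma=1.0)
print(new_value)  # 2.5
```

Because the update only needs sampled transitions and rewards, the same rule applies whether the samples come from a plant or from a simulation model.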

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Routine activity theory, introduced by Cohen & Felson in 1979, states that criminal acts are caused by the presence of criminals and victims and the absence of guardians in time and place. As the number of collisions of these elements in place and time increases, criminal acts also increase, even if the number of criminals or civilians within the vicinity of a city remains the same. Street robbery is a typical example of routine activity theory, and its occurrence can be predicted using the theory. Agent-based models allow simulation of diversity among individuals. Therefore, agent-based simulation of street robbery can be used to visualize how chronological aspects of human activity influence the incidence of street robbery. The conceptual model identifies three classes of people (criminals, civilians and police), with certain activity areas for each. Police exist only as agents of formal guardianship. Criminals with a tendency for crime search for their victims. Civilians without criminal tendency can be either victims or guardians. In addition to criminal tendency, each civilian in the model has a unique set of characteristics such as wealth, employment status and ability for guardianship. These agents are subjected to a random walk through a street environment guided by a Q-learning module, and the possible outcomes are analyzed.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Reinforcement Learning (RL) refers to a class of learning algorithms in which a learning system learns which action to take in different situations by using a scalar evaluation received from the environment on performing an action. RL has been successfully applied to many multi stage decision making problems (MDP), where in each stage the learning system decides which action has to be taken. The Economic Dispatch (ED) problem is an important scheduling problem in power systems, which decides the amount of generation to be allocated to each generating unit so that the total cost of generation is minimized without violating system constraints. In this paper we formulate the economic dispatch problem as a multi stage decision making problem and develop an RL based algorithm to solve it. The performance of our algorithm is compared with other recent methods. The main advantage of our method is that it can learn the schedule for all possible demands simultaneously.
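
One way to picture the multi stage formulation described above is to treat each generating unit as a stage, the demand still to be met as the state, and the power assigned to the unit as the action. The sketch below solves that formulation by exhaustive backward recursion, which is not the paper's RL algorithm; an RL agent would estimate the same stage values from samples. The cost table `COST` and the 50 MW power steps are hypothetical.

```python
from functools import lru_cache
from typing import List, Tuple

# Hypothetical cost tables: COST[k][p] is the cost of unit k producing p MW.
COST = [
    {0: 0.0, 50: 100.0, 100: 180.0},
    {0: 0.0, 50: 90.0, 100: 210.0},
    {0: 0.0, 50: 120.0, 100: 200.0},
]


@lru_cache(maxsize=None)
def best(stage: int, remaining: int) -> Tuple[float, Tuple[int, ...]]:
    """Minimum cost of meeting `remaining` MW with units stage, stage+1, ..."""
    if stage == len(COST):
        # All units decided: feasible only if the demand is met exactly.
        return (0.0, ()) if remaining == 0 else (float("inf"), ())
    options = []
    for power, cost in COST[stage].items():
        if power <= remaining:
            tail_cost, tail_plan = best(stage + 1, remaining - power)
            options.append((cost + tail_cost, (power,) + tail_plan))
    return min(options)


def dispatch(demand: int) -> Tuple[float, List[int]]:
    """Return (minimum total cost, MW assigned to each unit) for a demand."""
    total, plan = best(0, demand)
    return total, list(plan)


print(dispatch(150))  # (270.0, [100, 50, 0])
```

Since the state is the remaining demand, the cached table built for one demand is reused for every other demand, which is the property the abstract highlights for the learned schedule.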

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents Reinforcement Learning (RL) approaches to the Economic Dispatch problem. Economic Dispatch is first formulated as a multi stage decision making problem, and then two variants of RL algorithms are presented. A third algorithm, which takes the transmission losses into consideration, is also explained. The efficiency and flexibility of the proposed algorithms are demonstrated through different representative systems: a three generator system with a given generation cost table, the IEEE 30 bus system with quadratic cost functions, a 10 generator system with piecewise quadratic cost functions and a 20 generator system considering transmission losses. A comparison of the computation times of the different algorithms is also carried out.
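
For the quadratic cost case mentioned above, a classical closed-form baseline is often useful for checking a learned schedule. The sketch below applies the equal incremental cost condition to costs of the form C_i(P) = a_i + b_i P + c_i P^2. It is not one of the paper's RL algorithms, it ignores generator limits and transmission losses, and the coefficients are hypothetical.

```python
from typing import List, Sequence, Tuple


def equal_lambda_dispatch(b: Sequence[float], c: Sequence[float],
                          demand: float) -> Tuple[float, List[float]]:
    """Lossless, unconstrained dispatch for costs a_i + b_i*P + c_i*P**2.

    At the optimum every unit has the same incremental cost:
    b_i + 2*c_i*P_i = lam, with sum(P_i) = demand.
    """
    inv = [1.0 / (2.0 * ci) for ci in c]
    lam = (demand + sum(bi * vi for bi, vi in zip(b, inv))) / sum(inv)
    outputs = [(lam - bi) * vi for bi, vi in zip(b, inv)]
    return lam, outputs


# Hypothetical two-unit system: lam is about 5.0, outputs about [150.0, 50.0].
lam, outputs = equal_lambda_dispatch(b=[2.0, 3.0], c=[0.01, 0.02], demand=200.0)
print(lam, outputs)
```

The constant terms a_i do not affect the allocation, so they are left out of the function signature.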

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Unit Commitment Problem (UCP) in power systems refers to the problem of determining the on/off status of generating units that minimizes the operating cost during a given time horizon. Since various system and generation constraints must be satisfied while finding the optimum schedule, UCP becomes a constrained optimization problem in power system scheduling. The numerical solutions developed so far are limited to small systems, and heuristic methodologies have difficulty handling the stochastic cost functions associated with practical systems. This paper models Unit Commitment as a multi stage decision making task, and an efficient Reinforcement Learning solution is formulated considering minimum up time / down time constraints. The correctness and efficiency of the developed solutions are verified for standard test systems.
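
The minimum up time / down time constraints mentioned above can be made concrete with a small feasibility check on one unit's on/off schedule. This is an illustrative sketch only: the function name is hypothetical, and runs that touch the start or end of the horizon are not checked, on the assumption that the unit's history outside the horizon is unknown.

```python
from itertools import groupby
from typing import Sequence


def satisfies_min_up_down(status: Sequence[int], min_up: int, min_down: int) -> bool:
    """Check a 0/1 hourly schedule against minimum up and down times."""
    # Collapse the schedule into runs, e.g. [0,1,1,0] -> [(0,1), (1,2), (0,1)].
    runs = [(state, len(list(group))) for state, group in groupby(status)]
    # Only interior runs are checked; boundary runs may continue outside the horizon.
    for state, length in runs[1:-1]:
        required = min_up if state == 1 else min_down
        if length < required:
            return False
    return True


print(satisfies_min_up_down([0, 1, 1, 1, 0, 0, 1], min_up=3, min_down=2))  # True
print(satisfies_min_up_down([0, 1, 0, 0, 1, 1], min_up=2, min_down=2))     # False
```

In a multi stage formulation, a check like this determines which on/off actions are admissible at each stage.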

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Unit commitment is an optimization task in the electric power generation control sector. It involves scheduling the ON/OFF status of the generating units to meet the load demand with minimum generation cost while satisfying the different constraints existing in the system. The numerical solutions developed so far are limited to small systems, and heuristic methodologies have difficulty handling the stochastic cost functions associated with practical systems. This paper models Unit Commitment as a multi stage decision task, and a Reinforcement Learning solution is formulated through one efficient exploration strategy: the Pursuit method. The correctness and efficiency of the developed solutions are verified for standard test systems.
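
The pursuit strategy named above keeps an explicit probability for each action and, after every step, moves those probabilities toward the action that currently looks best. The sketch below shows that generic update; how the paper defines states, actions and estimates is not stated in the abstract, so the step size `beta`, the use of negative cost as the estimate and the function names are assumptions.

```python
import random
from typing import List, Sequence


def pursuit_update(probs: Sequence[float], estimates: Sequence[float],
                   beta: float = 0.1) -> List[float]:
    """Move action probabilities toward the currently greedy action.

    The greedy action's probability moves toward 1, all others toward 0,
    each by a fraction beta, so the probabilities still sum to 1.
    """
    greedy = max(range(len(estimates)), key=lambda a: estimates[a])
    return [p + beta * ((1.0 if a == greedy else 0.0) - p)
            for a, p in enumerate(probs)]


def choose_action(probs: Sequence[float], rng: random.Random) -> int:
    """Sample an action index according to the current probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical estimates: negative costs, so the cheaper action (index 1) is greedy.
new_probs = pursuit_update([0.5, 0.5], estimates=[-120.0, -100.0], beta=0.1)
print(new_probs)  # approximately [0.45, 0.55]
```

Because the probabilities change gradually, non-greedy actions keep being tried early on and are selected less often as the estimates settle.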

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This report addresses the problem of achieving cooperation within small- to medium- sized teams of heterogeneous mobile robots. I describe a software architecture I have developed, called ALLIANCE, that facilitates robust, fault tolerant, reliable, and adaptive cooperative control. In addition, an extended version of ALLIANCE, called L-ALLIANCE, is described, which incorporates a dynamic parameter update mechanism that allows teams of mobile robots to improve the efficiency of their mission performance through learning. A number of experimental results of implementing these architectures on both physical and simulated mobile robot teams are described. In addition, this report presents the results of studies of a number of issues in mobile robot cooperation, including fault tolerant cooperative control, adaptive action selection, distributed control, robot awareness of team member actions, improving efficiency through learning, inter-robot communication, action recognition, and local versus global control.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

As AI has begun to reach out beyond its symbolic, objectivist roots into the embodied, experientialist realm, many projects are exploring different aspects of creating machines which interact with and respond to the world as humans do. Techniques for visual processing, object recognition, emotional response, gesture production and recognition, etc., are necessary components of a complete humanoid robot. However, most projects invariably concentrate on developing a few of these individual components, neglecting the issue of how all of these pieces would eventually fit together. The focus of the work in this dissertation is on creating a framework into which such specific competencies can be embedded, in a way that they can interact with each other and build layers of new functionality. To be of any practical value, such a framework must satisfy the real-world constraints of functioning in real-time with noisy sensors and actuators. The humanoid robot Cog provides an unapologetically adequate platform from which to take on such a challenge. This work makes three contributions to embodied AI. First, it offers a general-purpose architecture for developing behavior-based systems distributed over networks of PC's. Second, it provides a motor-control system that simulates several biological features which impact the development of motor behavior. Third, it develops a framework for a system which enables a robot to learn new behaviors via interacting with itself and the outside world. A few basic functional modules are built into this framework, enough to demonstrate the robot learning some very simple behaviors taught by a human trainer. A primary motivation for this project is the notion that it is practically impossible to build an "intelligent" machine unless it is designed partly to build itself. This work is a proof-of-concept of such an approach to integrating multiple perceptual and motor systems into a complete learning agent.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper we present a novel approach to assigning roles to robots in a team of physical heterogeneous robots. Its members compete for these roles and get rewards for them. The rewards are used to determine each agent’s preferences and which agents are better adapted to the environment. These aspects are included in the decision making process. Agent interactions are modelled using the concept of an ecosystem in which each robot is a species, resulting in emergent behaviour of the whole set of agents. One of the most important features of this approach is its high adaptability. Unlike some other learning techniques, this approach does not need to start a whole exploitation process when the environment changes. All this is exemplified by means of experiments run on a simulator. In addition, the algorithm developed was applied to several teams of robots in order to analyse the impact of heterogeneity in these systems.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Our work is focused on alleviating the workload of designers of adaptive courses in the complex task of authoring adaptive learning designs adjusted to specific user characteristics and the user context. We propose an adaptation platform that consists of a set of intelligent agents, where each agent carries out an independent adaptation task. The agents apply machine learning techniques to support the user modelling for the adaptation process.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The purpose of this paper is to propose a Neural-Q_learning approach designed for online learning of simple and reactive robot behaviors. In this approach, the Q_function is generalized by a multi-layer neural network, allowing the use of continuous states and actions. The algorithm uses a database of the most recent learning samples to accelerate and guarantee the convergence. Each Neural-Q_learning function represents an independent, reactive and adaptive behavior which maps sensorial states to robot control actions. A group of these behaviors constitutes a reactive control scheme designed to fulfill simple missions. The paper centers on the description of the Neural-Q_learning based behaviors, showing their performance with an underwater robot in a target following task. Real experiments demonstrate the convergence and stability of the learning system, pointing out its suitability for online robot learning. Advantages and limitations are discussed.
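
The database of the most recent learning samples described above can be pictured as a fixed-capacity buffer that discards the oldest sample when a new one arrives. The sketch below shows only that storage idea; the class name, the tuple layout and the uniform sampling are assumptions, and the neural network that would be trained on the sampled batches is omitted.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (state, action, reward, next_state), with continuous values stored as tuples.
Sample = Tuple[tuple, tuple, float, tuple]


class RecentSampleDatabase:
    """Keeps only the `capacity` most recent learning samples."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen drops the oldest entry automatically when full.
        self._samples: Deque[Sample] = deque(maxlen=capacity)

    def add(self, state: tuple, action: tuple, reward: float, next_state: tuple) -> None:
        self._samples.append((state, action, reward, next_state))

    def sample(self, batch_size: int, rng: random.Random) -> List[Sample]:
        """Draw up to batch_size distinct stored samples uniformly at random."""
        pool = list(self._samples)
        return rng.sample(pool, min(batch_size, len(pool)))

    def __len__(self) -> int:
        return len(self._samples)


# Hypothetical usage: five samples are added but only the last three are kept.
db = RecentSampleDatabase(capacity=3)
for t in range(5):
    db.add((float(t),), (0.0,), -1.0, (float(t + 1),))
batch = db.sample(2, random.Random(0))
print(len(db), len(batch))  # 3 2
```

Training on batches drawn from such a buffer lets each sample be reused several times, which is one plausible reading of how the database accelerates learning.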

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents the findings of a podcasting trial held in 2007-2008 within the Faculty of Economics and Business at the University of Sydney, Australia. The trial investigates the value of using short-format podcasts to support assessment for postgraduate and undergraduate students. A multi-method approach is taken in investigating perceptions of the benefits of podcasting, incorporating surveys, focus groups and interviews. The results show that a majority of students believe they gained learning benefits from the podcasts and appreciated the flexibility of the medium to support their learning, and the lecturers felt the innovation helped diversify their pedagogical approach and support a diverse student population. Three primary conclusions are presented: (1) most students reject the mobile potential of podcasting in favour of their traditional study space at home; (2) what students and lecturers value about this podcasting design overlaps; (3) the assessment-focussed, short-format podcast design may be considered a successful podcasting model. The paper finishes by identifying areas for future research on the effective use of podcasting in learning and teaching.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The theoretical and applied contributions of complexity to economics have taken so many directions and have been so frenetic in recent decades that, as far as we know, no recent work compiles and analyzes them in an integrated way. The objective of this project, therefore, is to develop a survey of the current state of the different conceptual, theoretical, methodological and technological applications of the complexity sciences in economics. It also aims to analyze recent trends in the study of the complexity of economic systems and the horizons that the complexity sciences offer for addressing the economic phenomena of the contemporary globalized world.