799 resultados para Adaptive Learning Systems
Resumo:
Self-adaptive software provides a profound solution for adapting applications to changing contexts in dynamic and heterogeneous environments. Having emerged from Autonomic Computing, it incorporates fully autonomous decision making based on predefined structural and behavioural models. The most common approach for architectural runtime adaptation is the MAPE-K adaptation loop implementing an external adaptation manager without manual user control. However, it has turned out that adaptation behaviour lacks acceptance if it does not correspond to a user’s expectations – particularly for Ubiquitous Computing scenarios with user interaction. Adaptations can be irritating and distracting if they are not appropriate for a certain situation. In general, uncertainty during development and at run-time causes problems with users being outside the adaptation loop. In a literature study, we analyse publications about self-adaptive software research. The results show a discrepancy between the motivated application domains, the maturity of examples, and the quality of evaluations on the one hand and the provided solutions on the other hand. Only few publications analysed the impact of their work on the user, but many employ user-oriented examples for motivation and demonstration. To incorporate the user within the adaptation loop and to deal with uncertainty, our proposed solutions enable user participation for interactive selfadaptive software while at the same time maintaining the benefits of intelligent autonomous behaviour. We define three dimensions of user participation, namely temporal, behavioural, and structural user participation. This dissertation contributes solutions for user participation in the temporal and behavioural dimension. The temporal dimension addresses the moment of adaptation which is classically determined by the self-adaptive system. We provide mechanisms allowing users to influence or to define the moment of adaptation. With our solution, users can have full control over the moment of adaptation or the self-adaptive software considers the user’s situation more appropriately. The behavioural dimension addresses the actual adaptation logic and the resulting run-time behaviour. Application behaviour is established during development and does not necessarily match the run-time expectations. Our contributions are three distinct solutions which allow users to make changes to the application’s runtime behaviour: dynamic utility functions, fuzzy-based reasoning, and learning-based reasoning. The foundation of our work is a notification and feedback solution that improves intelligibility and controllability of self-adaptive applications by implementing a bi-directional communication between self-adaptive software and the user. The different mechanisms from the temporal and behavioural participation dimension require the notification and feedback solution to inform users on adaptation actions and to provide a mechanism to influence adaptations. Case studies show the feasibility of the developed solutions. Moreover, an extensive user study with 62 participants was conducted to evaluate the impact of notifications before and after adaptations. Although the study revealed that there is no preference for a particular notification design, participants clearly appreciated intelligibility and controllability over autonomous adaptations.
Resumo:
We describe an adaptive, mid-level approach to the wireless device power management problem. Our approach is based on reinforcement learning, a machine learning framework for autonomous agents. We describe how our framework can be applied to the power management problem in both infrastructure and ad~hoc wireless networks. From this thesis we conclude that mid-level power management policies can outperform low-level policies and are more convenient to implement than high-level policies. We also conclude that power management policies need to adapt to the user and network, and that a mid-level power management framework based on reinforcement learning fulfills these requirements.
Resumo:
As AI has begun to reach out beyond its symbolic, objectivist roots into the embodied, experientialist realm, many projects are exploring different aspects of creating machines which interact with and respond to the world as humans do. Techniques for visual processing, object recognition, emotional response, gesture production and recognition, etc., are necessary components of a complete humanoid robot. However, most projects invariably concentrate on developing a few of these individual components, neglecting the issue of how all of these pieces would eventually fit together. The focus of the work in this dissertation is on creating a framework into which such specific competencies can be embedded, in a way that they can interact with each other and build layers of new functionality. To be of any practical value, such a framework must satisfy the real-world constraints of functioning in real-time with noisy sensors and actuators. The humanoid robot Cog provides an unapologetically adequate platform from which to take on such a challenge. This work makes three contributions to embodied AI. First, it offers a general-purpose architecture for developing behavior-based systems distributed over networks of PC's. Second, it provides a motor-control system that simulates several biological features which impact the development of motor behavior. Third, it develops a framework for a system which enables a robot to learn new behaviors via interacting with itself and the outside world. A few basic functional modules are built into this framework, enough to demonstrate the robot learning some very simple behaviors taught by a human trainer. A primary motivation for this project is the notion that it is practically impossible to build an "intelligent" machine unless it is designed partly to build itself. This work is a proof-of-concept of such an approach to integrating multiple perceptual and motor systems into a complete learning agent.
Resumo:
In most classical frameworks for learning from examples, it is assumed that examples are randomly drawn and presented to the learner. In this paper, we consider the possibility of a more active learner who is allowed to choose his/her own examples. Our investigations are carried out in a function approximation setting. In particular, using arguments from optimal recovery (Micchelli and Rivlin, 1976), we develop an adaptive sampling strategy (equivalent to adaptive approximation) for arbitrary approximation schemes. We provide a general formulation of the problem and show how it can be regarded as sequential optimal recovery. We demonstrate the application of this general formulation to two special cases of functions on the real line 1) monotonically increasing functions and 2) functions with bounded derivative. An extensive investigation of the sample complexity of approximating these functions is conducted yielding both theoretical and empirical results on test functions. Our theoretical results (stated insPAC-style), along with the simulations demonstrate the superiority of our active scheme over both passive learning as well as classical optimal recovery. The analysis of active function approximation is conducted in a worst-case setting, in contrast with other Bayesian paradigms obtained from optimal design (Mackay, 1992).
Resumo:
The objects with which the hand interacts with may significantly change the dynamics of the arm. How does the brain adapt control of arm movements to this new dynamic? We show that adaptation is via composition of a model of the task's dynamics. By exploring generalization capabilities of this adaptation we infer some of the properties of the computational elements with which the brain formed this model: the elements have broad receptive fields and encode the learned dynamics as a map structured in an intrinsic coordinate system closely related to the geometry of the skeletomusculature. The low--level nature of these elements suggests that they may represent asset of primitives with which a movement is represented in the CNS.
Resumo:
In this paper we present a component based person detection system that is capable of detecting frontal, rear and near side views of people, and partially occluded persons in cluttered scenes. The framework that is described here for people is easily applied to other objects as well. The motivation for developing a component based approach is two fold: first, to enhance the performance of person detection systems on frontal and rear views of people and second, to develop a framework that directly addresses the problem of detecting people who are partially occluded or whose body parts blend in with the background. The data classification is handled by several support vector machine classifiers arranged in two layers. This architecture is known as Adaptive Combination of Classifiers (ACC). The system performs very well and is capable of detecting people even when all components of a person are not found. The performance of the system is significantly better than a full body person detector designed along similar lines. This suggests that the improved performance is due to the components based approach and the ACC data classification structure.
Resumo:
In this paper we present a novel approach to assigning roles to robots in a team of physical heterogeneous robots. Its members compete for these roles and get rewards for them. The rewards are used to determine each agent’s preferences and which agents are better adapted to the environment. These aspects are included in the decision making process. Agent interactions are modelled using the concept of an ecosystem in which each robot is a species, resulting in emergent behaviour of the whole set of agents. One of the most important features of this approach is its high adaptability. Unlike some other learning techniques, this approach does not need to start a whole exploitation process when the environment changes. All this is exemplified by means of experiments run on a simulator. In addition, the algorithm developed was applied as applied to several teams of robots in order to analyse the impact of heterogeneity in these systems
Resumo:
This paper proposes a hybrid coordination method for behavior-based control architectures. The hybrid method takes advantages of the robustness and modularity in competitive approaches as well as optimized trajectories in cooperative ones. This paper shows the feasibility of applying this hybrid method with a 3D-navigation to an autonomous underwater vehicle (AUV). The behaviors are learnt online by means of reinforcement learning. A continuous Q-learning implemented with a feed-forward neural network is employed. Realistic simulations were carried out. The results obtained show the good performance of the hybrid method on behavior coordination as well as the convergence of the behaviors
Resumo:
Reinforcement learning (RL) is a very suitable technique for robot learning, as it can learn in unknown environments and in real-time computation. The main difficulties in adapting classic RL algorithms to robotic systems are the generalization problem and the correct observation of the Markovian state. This paper attempts to solve the generalization problem by proposing the semi-online neural-Q_learning algorithm (SONQL). The algorithm uses the classic Q_learning technique with two modifications. First, a neural network (NN) approximates the Q_function allowing the use of continuous states and actions. Second, a database of the most representative learning samples accelerates and stabilizes the convergence. The term semi-online is referred to the fact that the algorithm uses the current but also past learning samples. However, the algorithm is able to learn in real-time while the robot is interacting with the environment. The paper shows simulated results with the "mountain-car" benchmark and, also, real results with an underwater robot in a target following behavior
Resumo:
This paper proposes a field application of a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot in cable tracking task. The learning system is characterized by using a direct policy search method for learning the internal state/action mapping. Policy only algorithms may suffer from long convergence times when dealing with real robotics. In order to speed up the process, the learning phase has been carried out in a simulated environment and, in a second step, the policy has been transferred and tested successfully on a real robot. Future steps plan to continue the learning process on-line while on the real robot while performing the mentioned task. We demonstrate its feasibility with real experiments on the underwater robot ICTINEU AUV
Resumo:
Autonomous underwater vehicles (AUV) represent a challenging control problem with complex, noisy, dynamics. Nowadays, not only the continuous scientific advances in underwater robotics but the increasing number of subsea missions and its complexity ask for an automatization of submarine processes. This paper proposes a high-level control system for solving the action selection problem of an autonomous robot. The system is characterized by the use of reinforcement learning direct policy search methods (RLDPS) for learning the internal state/action mapping of some behaviors. We demonstrate its feasibility with simulated experiments using the model of our underwater robot URIS in a target following task
Resumo:
This paper proposes a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach, when using RL, has been to apply value function based algorithms, the system here detailed is characterized by the use of direct policy search methods. Rather than approximating a value function, these methodologies approximate a policy using an independent function approximator with its own parameters, trying to maximize the future expected reward. The policy based algorithm presented in this paper is used for learning the internal state/action mapping of a behavior. In this preliminary work, we demonstrate its feasibility with simulated experiments using the underwater robot GARBI in a target reaching task
Resumo:
This paper presents a first approach of Evaluation Engine Architecture (EEA) as proposal to support adaptive integral assessment, in the context of a virtual learning environment. The goal of our research is design an evaluation engine tool to assist in the whole assessment process within the A2UN@ project, linking that tool with the other key elements of a learning design (learning task, learning resources and learning support). The teachers would define the relation between knowledge, competencies, activities, resources and type of assessment. Providing this relation is possible obtain more accurate estimations of student's knowledge for adaptive evaluations and future recommendations. The process is supported by usage of educational standards and specifications and for an integral user modelling
Resumo:
Engineering of negotiation model allows to develop effective heuristic for business intelligence. Digital ecosystems demand open negotiation models. To define in advance effective heuristics is not compliant with the requirement of openness. The new challenge is to develop business intelligence in advance exploiting an adaptive approach. The idea is to learn business strategy once new negotiation model rise in the e-market arena. In this paper we present how recommendation technology may be deployed in an open negotiation environment where the interaction protocol models are not known in advance. The solution we propose is delivered as part of the ONE Platform, open source software that implements a fully distributed open environment for business negotiation
Resumo:
The speed of fault isolation is crucial for the design and reconfiguration of fault tolerant control (FTC). In this paper the fault isolation problem is stated as a constraint satisfaction problem (CSP) and solved using constraint propagation techniques. The proposed method is based on constraint satisfaction techniques and uncertainty space refining of interval parameters. In comparison with other approaches based on adaptive observers, the major advantage of the presented method is that the isolation speed is fast even taking into account uncertainty in parameters, measurements and model errors and without the monotonicity assumption. In order to illustrate the proposed approach, a case study of a nonlinear dynamic system is presented