861 resultados para Robust Learning Algorithm


Relevância:

100.00% 100.00%

Publicador:

Resumo:

* This research was partially supported by the Latvian Science Foundation under grant No.02-86d.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper investigates how to make improved action selection for online policy learning in robotic scenarios using reinforcement learning (RL) algorithms. Since finding control policies using any RL algorithm can be very time consuming, we propose to combine RL algorithms with heuristic functions for selecting promising actions during the learning process. With this aim, we investigate the use of heuristics for increasing the rate of convergence of RL algorithms and contribute with a new learning algorithm, Heuristically Accelerated Q-learning (HAQL), which incorporates heuristics for action selection to the Q-Learning algorithm. Experimental results on robot navigation show that the use of even very simple heuristic functions results in significant performance enhancement of the learning rate.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Electricity markets are complex environments, involving a large number of different entities, playing in a dynamic scene to obtain the best advantages and profits. MASCEM is a multi-agent electricity market simulator to model market players and simulate their operation in the market. Market players are entities with specific characteristics and objectives, making their decisions and interacting with other players. MASCEM is integrated with ALBidS, a system that provides several dynamic strategies for agents’ behavior. This paper presents a method that aims at enhancing ALBidS competence in endowing market players with adequate strategic bidding capabilities, allowing them to obtain the higher possible gains out of the market. This method uses a reinforcement learning algorithm to learn from experience how to choose the best from a set of possible actions. These actions are defined accordingly to the most probable points of bidding success. With the purpose of accelerating the convergence process, a simulated annealing based algorithm is included.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Electricity markets are complex environments, involving a large number of different entities, playing in a dynamic scene to obtain the best advantages and profits. MASCEM is a multi-agent electricity market simulator to model market players and simulate their operation in the market. Market players are entities with specific characteristics and objectives, making their decisions and interacting with other players. MASCEM provides several dynamic strategies for agents’ behavior. This paper presents a method that aims to provide market players with strategic bidding capabilities, allowing them to obtain the higher possible gains out of the market. This method uses a reinforcement learning algorithm to learn from experience how to choose the best from a set of possible bids. These bids are defined accordingly to the cost function that each producer presents.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper presents the applicability of a reinforcement learning algorithm based on the application of the Bayesian theorem of probability. The proposed reinforcement learning algorithm is an advantageous and indispensable tool for ALBidS (Adaptive Learning strategic Bidding System), a multi-agent system that has the purpose of providing decision support to electricity market negotiating players. ALBidS uses a set of different strategies for providing decision support to market players. These strategies are used accordingly to their probability of success for each different context. The approach proposed in this paper uses a Bayesian network for deciding the most probably successful action at each time, depending on past events. The performance of the proposed methodology is tested using electricity market simulations in MASCEM (Multi-Agent Simulator of Competitive Electricity Markets). MASCEM provides the means for simulating a real electricity market environment, based on real data from real electricity market operators.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Dissertation presented to obtain the Ph.D degree in Biology, Neuroscience

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper investigates the role of learning by private agents and the central bank (two-sided learning) in a New Keynesian framework in which both sides of the economy have asymmetric and imperfect knowledge about the true data generating process. We assume that all agents employ the data that they observe (which may be distinct for different sets of agents) to form beliefs about unknown aspects of the true model of the economy, use their beliefs to decide on actions, and revise these beliefs through a statistical learning algorithm as new information becomes available. We study the short-run dynamics of our model and derive its policy recommendations, particularly with respect to central bank communications. We demonstrate that two-sided learning can generate substantial increases in volatility and persistence, and alter the behavior of the variables in the model in a signifficant way. Our simulations do not converge to a symmetric rational expectations equilibrium and we highlight one source that invalidates the convergence results of Marcet and Sargent (1989). Finally, we identify a novel aspect of central bank communication in models of learning: communication can be harmful if the central bank's model is substantially mis-specified

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper investigates the role of learning by private agents and the central bank(two-sided learning) in a New Keynesian framework in which both sides of the economyhave asymmetric and imperfect knowledge about the true data generating process. Weassume that all agents employ the data that they observe (which may be distinct fordifferent sets of agents) to form beliefs about unknown aspects of the true model ofthe economy, use their beliefs to decide on actions, and revise these beliefs througha statistical learning algorithm as new information becomes available. We study theshort-run dynamics of our model and derive its policy recommendations, particularlywith respect to central bank communications. We demonstrate that two-sided learningcan generate substantial increases in volatility and persistence, and alter the behaviorof the variables in the model in a significant way. Our simulations do not convergeto a symmetric rational expectations equilibrium and we highlight one source thatinvalidates the convergence results of Marcet and Sargent (1989). Finally, we identifya novel aspect of central bank communication in models of learning: communicationcan be harmful if the central bank's model is substantially mis-specified.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Fluvial deposits are a challenge for modelling flow in sub-surface reservoirs. Connectivity and continuity of permeable bodies have a major impact on fluid flow in porous media. Contemporary object-based and multipoint statistics methods face a problem of robust representation of connected structures. An alternative approach to model petrophysical properties is based on machine learning algorithm ? Support Vector Regression (SVR). Semi-supervised SVR is able to establish spatial connectivity taking into account the prior knowledge on natural similarities. SVR as a learning algorithm is robust to noise and captures dependencies from all available data. Semi-supervised SVR applied to a synthetic fluvial reservoir demonstrated robust results, which are well matched to the flow performance

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we consider active sampling to label pixels grouped with hierarchical clustering. The objective of the method is to match the data relationships discovered by the clustering algorithm with the user's desired class semantics. The first is represented as a complete tree to be pruned and the second is iteratively provided by the user. The active learning algorithm proposed searches the pruning of the tree that best matches the labels of the sampled points. By choosing the part of the tree to sample from according to current pruning's uncertainty, sampling is focused on most uncertain clusters. This way, large clusters for which the class membership is already fixed are no longer queried and sampling is focused on division of clusters showing mixed labels. The model is tested on a VHR image in a multiclass classification setting. The method clearly outperforms random sampling in a transductive setting, but cannot generalize to unseen data, since it aims at optimizing the classification of a given cluster structure.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Recent advances in machine learning methods enable increasingly the automatic construction of various types of computer assisted methods that have been difficult or laborious to program by human experts. The tasks for which this kind of tools are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. The machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question. However, their learning performance can often be improved by taking advantage of deeper insight of the application domain or the learning problem at hand. This thesis considers developing kernel-based learning algorithms incorporating this kind of prior knowledge of the task in question in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, the incorporation of prior knowledge is often done by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit to the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take account of the positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suitable for the task of information retrieval and for more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of the kernel-based learning algorithms such as text categorization, and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions. We also design a fast cross-validation algorithm for regularized least-squares type of learning algorithm. Further, an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks is proposed. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and novel advanced kernels and cost functions can be used in algorithms efficiently.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Network virtualisation is considerably gaining attentionas a solution to ossification of the Internet. However, thesuccess of network virtualisation will depend in part on how efficientlythe virtual networks utilise substrate network resources.In this paper, we propose a machine learning-based approachto virtual network resource management. We propose to modelthe substrate network as a decentralised system and introducea learning algorithm in each substrate node and substrate link,providing self-organization capabilities. We propose a multiagentlearning algorithm that carries out the substrate network resourcemanagement in a coordinated and decentralised way. The taskof these agents is to use evaluative feedback to learn an optimalpolicy so as to dynamically allocate network resources to virtualnodes and links. The agents ensure that while the virtual networkshave the resources they need at any given time, only the requiredresources are reserved for this purpose. Simulations show thatour dynamic approach significantly improves the virtual networkacceptance ratio and the maximum number of accepted virtualnetwork requests at any time while ensuring that virtual networkquality of service requirements such as packet drop rate andvirtual link delay are not affected.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Cette thèse envisage un ensemble de méthodes permettant aux algorithmes d'apprentissage statistique de mieux traiter la nature séquentielle des problèmes de gestion de portefeuilles financiers. Nous débutons par une considération du problème général de la composition d'algorithmes d'apprentissage devant gérer des tâches séquentielles, en particulier celui de la mise-à-jour efficace des ensembles d'apprentissage dans un cadre de validation séquentielle. Nous énumérons les desiderata que des primitives de composition doivent satisfaire, et faisons ressortir la difficulté de les atteindre de façon rigoureuse et efficace. Nous poursuivons en présentant un ensemble d'algorithmes qui atteignent ces objectifs et présentons une étude de cas d'un système complexe de prise de décision financière utilisant ces techniques. Nous décrivons ensuite une méthode générale permettant de transformer un problème de décision séquentielle non-Markovien en un problème d'apprentissage supervisé en employant un algorithme de recherche basé sur les K meilleurs chemins. Nous traitons d'une application en gestion de portefeuille où nous entraînons un algorithme d'apprentissage à optimiser directement un ratio de Sharpe (ou autre critère non-additif incorporant une aversion au risque). Nous illustrons l'approche par une étude expérimentale approfondie, proposant une architecture de réseaux de neurones spécialisée à la gestion de portefeuille et la comparant à plusieurs alternatives. Finalement, nous introduisons une représentation fonctionnelle de séries chronologiques permettant à des prévisions d'être effectuées sur un horizon variable, tout en utilisant un ensemble informationnel révélé de manière progressive. L'approche est basée sur l'utilisation des processus Gaussiens, lesquels fournissent une matrice de covariance complète entre tous les points pour lesquels une prévision est demandée. Cette information est utilisée à bon escient par un algorithme qui transige activement des écarts de cours (price spreads) entre des contrats à terme sur commodités. L'approche proposée produit, hors échantillon, un rendement ajusté pour le risque significatif, après frais de transactions, sur un portefeuille de 30 actifs.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, a new methodology for the prediction of scoliosis curve types from non invasive acquisitions of the back surface of the trunk is proposed. One hundred and fifty-nine scoliosis patients had their back surface acquired in 3D using an optical digitizer. Each surface is then characterized by 45 local measurements of the back surface rotation. Using a semi-supervised algorithm, the classifier is trained with only 32 labeled and 58 unlabeled data. Tested on 69 new samples, the classifier succeeded in classifying correctly 87.0% of the data. After reducing the number of labeled training samples to 12, the behavior of the resulting classifier tends to be similar to the reference case where the classifier is trained only with the maximum number of available labeled data. Moreover, the addition of unlabeled data guided the classifier towards more generalizable boundaries between the classes. Those results provide a proof of feasibility for using a semi-supervised learning algorithm to train a classifier for the prediction of a scoliosis curve type, when only a few training data are labeled. This constitutes a promising clinical finding since it will allow the diagnosis and the follow-up of scoliotic deformities without exposing the patient to X-ray radiations.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper presents a Reinforcement Learning (RL) approach to economic dispatch (ED) using Radial Basis Function neural network. We formulate the ED as an N stage decision making problem. We propose a novel architecture to store Qvalues and present a learning algorithm to learn the weights of the neural network. Even though many stochastic search techniques like simulated annealing, genetic algorithm and evolutionary programming have been applied to ED, they require searching for the optimal solution for each load demand. Also they find limitation in handling stochastic cost functions. In our approach once we learn the Q-values, we can find the dispatch for any load demand. We have recently proposed a RL approach to ED. In that approach, we could find only the optimum dispatch for a set of specified discrete values of power demand. The performance of the proposed algorithm is validated by taking IEEE 6 bus system, considering transmission losses