971 results for Q-Learning
Abstract:
Lee, M., Meng, Q. (2005). Psychologically Inspired Sensory-Motor Development in Early Robot Learning. International Journal of Advanced Robotic Systems, 325-334.
Abstract:
M.H. Lee and Q. Meng, 'Psychologically Inspired Sensory-Motor Development in Early Robot Learning', in proceedings of Towards Autonomous Robotic Systems 2005 (TAROS-05), Nehmzow, U., Melhuish, C. and Witkowski, M. (Eds.), Imperial College London, 157-163, September 2005. See published version: http://hdl.handle.net/2160/485
Abstract:
We investigate the problem of learning disjunctions of counting functions, which are general cases of parity and modulo functions, with equivalence and membership queries. We prove that, for any prime number p, the class of disjunctions of integer-weighted counting functions with modulus p over the domain Z_q^n (or Z^n) for any given integer q ≥ 2 is polynomial time learnable using at most n + 1 equivalence queries, where the hypotheses issued by the learner are disjunctions of at most n counting functions with weights from Z_p. The result is obtained through learning linear systems over an arbitrary field. In general a counting function may have a composite modulus. We prove that, for any given integer q ≥ 2, over the domain Z_2^n, the class of read-once disjunctions of Boolean-weighted counting functions with modulus q is polynomial time learnable with only one equivalence query, and the class of disjunctions of log log n Boolean-weighted counting functions with modulus q is polynomial time learnable. Finally, we present an algorithm for learning graph-based counting functions.
Abstract:
We extend the contingent valuation (CV) method to test three differing conceptions of individuals' preferences as either (i) a-priori well-formed or readily divined and revealed through a single dichotomous choice question (as per the NOAA CV guidelines [K. Arrow, R. Solow, P.R. Portney, E.E. Leamer, R. Radner, H. Schuman, Report of the NOAA panel on contingent valuation, Fed. Reg. 58 (1993) 4601-4614]); (ii) learned or 'discovered' through a process of repetition and experience [J.A. List, Does market experience eliminate market anomalies? Q. J. Econ. (2003) 41-72; C.R. Plott, Rational individual behaviour in markets and social choice processes: the discovered preference hypothesis, in: K. Arrow, E. Colombatto, M. Perlman, C. Schmidt (Eds.), Rational Foundations of Economic Behaviour, Macmillan, London, St. Martin's, New York, 1996, pp. 225-250]; (iii) internally coherent but strongly influenced by some initial arbitrary anchor [D. Ariely, G. Loewenstein, D. Prelec, 'Coherent arbitrariness': stable demand curves without stable preferences, Q. J. Econ. 118(1) (2003) 73-105]. Findings reject both the first and last of these conceptions in favour of a model in which preferences converge towards standard expectations through a process of repetition and learning. In doing so, we show that such a 'learning design' CV method overturns the 'stylised facts' of bias and anchoring within the double-bounded dichotomous choice elicitation format. (C) 2007 Elsevier Inc. All rights reserved.
Abstract:
The purpose of the present report is to describe a community needs assessment that puts the process and choice of a suitable approach into a context. The study examined the mental health needs of children and youth with learning disabilities and their families and how they fit within the continuum of services in Metropolitan Toronto. A series of recommendations was developed for the Ministry of Community and Social Services. The recommendations emphasize: prevention, training and consultation, and research. The study illustrates the importance of involving relevant constituencies in both the planning of a needs assessment and the formulation and implementation of recommendations based on the investigation.
Abstract:
According to a higher order reasoning account, inferential reasoning processes underpin the widely observed cue competition effect of blocking in causal learning. The inference required for blocking has been described as modus tollens (if p then q, not q therefore not p). Young children are known to have difficulties with this type of inference, but research with adults suggests that this inference is easier if participants think counterfactually. In this study, 100 children (51 five-year-olds and 49 six- to seven-year-olds) were assigned to two types of pretraining groups. The counterfactual group observed demonstrations of cues paired with outcomes and answered questions about what the outcome would have been if the causal status of cues had been different, whereas the factual group answered factual questions about the same demonstrations. Children then completed a causal learning task. Counterfactual pretraining enhanced levels of blocking as well as modus tollens reasoning but only for the younger children. These findings provide new evidence for an important role for inferential reasoning in causal learning.
Abstract:
This work presents novel algorithms for learning Bayesian networks of bounded treewidth. Both exact and approximate methods are developed. The exact method combines mixed integer linear programming formulations for structure learning and treewidth computation. The approximate method consists in sampling k-trees (maximal graphs of treewidth k), and subsequently selecting, exactly or approximately, the best structure whose moral graph is a subgraph of that k-tree. The approaches are empirically compared to each other and to state-of-the-art methods on a collection of public data sets with up to 100 variables.
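The k-tree sampling step mentioned above can be illustrated with a minimal sketch. It uses the standard recursive definition of a k-tree (start from a clique on k + 1 nodes, then repeatedly connect a new node to all members of an existing k-clique). This simple construction is an assumption made for illustration: it does not sample k-trees uniformly and is not claimed to be the sampler used in the work. The function name `sample_k_tree` is hypothetical.

```python
import random
from itertools import combinations

def sample_k_tree(n, k, seed=None):
    """Grow a random k-tree on nodes 0..n-1 (simple, NOT uniform, sampling)."""
    if k < 1 or n < k + 1:
        raise ValueError("need k >= 1 and at least k + 1 nodes")
    rng = random.Random(seed)
    # Start from a complete graph on the first k + 1 nodes.
    base = list(range(k + 1))
    edges = set(combinations(base, 2))
    # Every k-subset of the initial clique is a k-clique a new node can attach to.
    k_cliques = [frozenset(c) for c in combinations(base, k)]
    for v in range(k + 1, n):
        clique = rng.choice(k_cliques)
        # Connect the new node to every member of the chosen k-clique.
        edges.update((u, v) for u in clique)
        # The new node forms fresh k-cliques with each (k-1)-subset of that clique.
        for sub in combinations(sorted(clique), k - 1):
            k_cliques.append(frozenset(sub) | {v})
    return edges

edges = sample_k_tree(n=8, k=2, seed=0)
print(len(edges), "edges")
```

Every k-tree on n nodes has k(k+1)/2 + (n-k-1)k edges, which gives a quick sanity check on the output. A structure learner in the spirit of the abstract would then restrict candidate networks to those whose moral graph is a subgraph of the sampled k-tree.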
Abstract:
This paper addresses the problem of learning Bayesian network structures from data based on score functions that are decomposable. It describes properties that strongly reduce the time and memory costs of many known methods without losing global optimality guarantees. These properties are derived for different score criteria such as Minimum Description Length (or Bayesian Information Criterion), Akaike Information Criterion and Bayesian Dirichlet Criterion. Then a branch-and-bound algorithm is presented that integrates structural constraints with data in a way that guarantees global optimality. As an example, structural constraints are used to map the problem of structure learning in Dynamic Bayesian networks into a corresponding augmented Bayesian network. Finally, we show empirically the benefits of using the properties with state-of-the-art methods and with the new algorithm, which is able to handle larger data sets than before.
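A decomposable score is one that can be written as a sum of local terms, each depending only on one variable and its parent set. The sketch below illustrates this for the Bayesian Information Criterion on discrete data. The function names, the dictionary-based data layout and the toy data set are assumptions made for illustration, and parent configurations are counted from the states observed in the data rather than from declared variable domains.

```python
import math
from collections import Counter

def local_bic(data, child, parents):
    """BIC contribution of one node given its parent set (discrete data)."""
    n = len(data)
    child_states = {row[child] for row in data}
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    # Maximised log-likelihood from empirical conditional frequencies.
    loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
    # Free parameters: (child states - 1) for each parent configuration.
    n_parent_configs = 1
    for p in parents:
        n_parent_configs *= len({row[p] for row in data})
    penalty = 0.5 * math.log(n) * (len(child_states) - 1) * n_parent_configs
    return loglik - penalty

def bic_score(data, structure):
    """Total score is the sum of local scores: this is decomposability."""
    return sum(local_bic(data, child, parents) for child, parents in structure.items())

# Toy data in which Y always copies X (purely illustrative).
data = [{"X": 0, "Y": 0}] * 4 + [{"X": 1, "Y": 1}] * 4
with_edge = {"X": (), "Y": ("X",)}
no_edge = {"X": (), "Y": ()}
print(bic_score(data, with_edge), bic_score(data, no_edge))
```

Because each local term can be computed and cached independently, a search procedure only needs to rescore the nodes whose parent sets change, which is what makes pruning rules over parent sets possible.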
Abstract:
Economic and environmental load dispatch aims to determine the amount of electricity generated from power plants to meet load demand while minimizing fossil fuel costs and air pollution emissions subject to operational and licensing requirements. These two scheduling problems are commonly formulated with non-smooth cost functions respectively considering various effects and constraints, such as the valve point effect, power balance and ramp rate limits. The expected increase in plug-in electric vehicles is likely to have a significant impact on the power system due to high charging power consumption and significant uncertainty in charging times. In this paper, multiple electric vehicle charging profiles are comparatively integrated into a 24-hour load demand in an economic and environmental dispatch model. Self-learning teaching-learning based optimization (TLBO) is employed to solve the non-convex non-linear dispatch problems. Numerical results on well-known benchmark functions, as well as test systems with different scales of generation units show the significance of the new scheduling method.
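The non-smooth cost mentioned above is commonly modelled in the dispatch literature as a quadratic fuel cost plus a rectified sine term for the valve point effect. The sketch below assumes that common form; the coefficient values, the two-unit system and the function names are hypothetical, and ramp rate limits and emissions are left out.

```python
import math

def fuel_cost(p, a, b, c, e, f, p_min):
    """Quadratic fuel cost plus a rectified-sine valve-point term."""
    return a + b * p + c * p ** 2 + abs(e * math.sin(f * (p_min - p)))

def total_cost(outputs, units):
    """Sum of unit costs for a candidate dispatch (one output per unit)."""
    return sum(fuel_cost(p, **u) for p, u in zip(outputs, units))

def power_balance_ok(outputs, demand, tol=1e-6):
    """Power balance without transmission losses: generation equals demand."""
    return abs(sum(outputs) - demand) <= tol

# Hypothetical coefficients for a two-unit system (illustrative only).
units = [
    {"a": 100.0, "b": 2.0, "c": 0.01, "e": 50.0, "f": 0.05, "p_min": 10.0},
    {"a": 120.0, "b": 1.8, "c": 0.02, "e": 40.0, "f": 0.04, "p_min": 20.0},
]
dispatch = [30.0, 20.0]
print(total_cost(dispatch, units), power_balance_ok(dispatch, demand=50.0))
```

The absolute value of the sine makes the cost non-differentiable and non-convex, which is why population-based optimizers such as TLBO are used instead of gradient methods.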
Abstract:
Learning Bayesian networks with bounded tree-width has attracted much attention recently, because low tree-width allows exact inference to be performed efficiently. Some existing methods [12, 14] tackle the problem by using k-trees to learn the optimal Bayesian network with tree-width up to k. In this paper, we propose a sampling method to efficiently find representative k-trees by introducing an Informative score function to characterize the quality of a k-tree. The proposed algorithm can efficiently learn a Bayesian network with tree-width at most k. Experimental results indicate that our approach is comparable with exact methods, but is much more computationally efficient.
Abstract:
Bounding the tree-width of a Bayesian network can reduce the chance of overfitting, and allows exact inference to be performed efficiently. Several existing algorithms tackle the problem of learning bounded tree-width Bayesian networks by learning from k-trees as super-structures, but they do not scale to large domains and/or large tree-width. We propose a guided search algorithm to find k-trees with maximum Informative score, a measure of how well a k-tree yields good Bayesian networks. The algorithm achieves close to optimal performance compared to exact solutions in small domains, and can discover better networks than existing approximate methods can in large domains. It also provides an optimal elimination order of variables that guarantees small complexity for later runs of exact inference. Comparisons with well-known approaches in terms of learning and inference accuracy illustrate its capabilities.
Abstract:
This paper presents a Reinforcement Learning (RL) approach to economic dispatch (ED) using a Radial Basis Function neural network. We formulate the ED as an N-stage decision-making problem. We propose a novel architecture to store Q-values and present a learning algorithm to learn the weights of the neural network. Even though many stochastic search techniques, such as simulated annealing, genetic algorithms and evolutionary programming, have been applied to ED, they require searching for the optimal solution for each load demand. They are also limited in handling stochastic cost functions. In our approach, once we learn the Q-values, we can find the dispatch for any load demand. We have recently proposed an RL approach to ED. In that approach, we could find the optimum dispatch only for a set of specified discrete values of power demand. The performance of the proposed algorithm is validated on the IEEE 6-bus system, considering transmission losses.
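The N-stage formulation can be illustrated with a small tabular Q-learning sketch, in which stage i chooses the output of unit i and the state is the stage together with the demand still to be covered. This is a deliberately simplified stand-in: it uses a lookup table and discrete output levels rather than the Radial Basis Function network described above, it ignores transmission losses, and the cost coefficients, output levels, penalty and function names are all hypothetical.

```python
import random

LEVELS = [0, 10, 20, 30]                           # allowed output per unit (MW)
COEFFS = [(2.0, 0.02), (1.5, 0.04), (3.0, 0.01)]   # hypothetical (a, b): cost = a*p + b*p^2
DEMAND = 40
PENALTY = 1000.0                                   # per MW of unmet or excess demand

def unit_cost(i, p):
    a, b = COEFFS[i]
    return a * p + b * p * p

def step(stage, remaining, p):
    """Commit output p for unit `stage`; return (reward, next_remaining, done)."""
    reward = -unit_cost(stage, p)
    remaining -= p
    done = stage == len(COEFFS) - 1
    if done:
        reward -= PENALTY * abs(remaining)         # punish any power mismatch
    return reward, remaining, done

def train(episodes=5000, alpha=0.5, seed=0):
    rng = random.Random(seed)
    q = {}                                         # (stage, remaining, p) -> value
    for _ in range(episodes):
        stage, remaining, done = 0, DEMAND, False
        while not done:
            # Q-learning is off-policy, so uniform random exploration is enough here.
            p = rng.choice(LEVELS)
            reward, nxt, done = step(stage, remaining, p)
            best_next = 0.0 if done else max(q.get((stage + 1, nxt, a), 0.0) for a in LEVELS)
            key = (stage, remaining, p)
            old = q.get(key, 0.0)
            q[key] = old + alpha * (reward + best_next - old)   # undiscounted update
            stage, remaining = stage + 1, nxt
    return q

def greedy_dispatch(q):
    """Read the dispatch off the learned table, one unit per stage."""
    remaining, plan = DEMAND, []
    for stage in range(len(COEFFS)):
        p = max(LEVELS, key=lambda a: q.get((stage, remaining, a), float("-inf")))
        plan.append(p)
        remaining -= p
    return plan

q = train()
plan = greedy_dispatch(q)
print(plan, sum(unit_cost(i, p) for i, p in enumerate(plan)))
```

A table like this only covers the discrete demand values it was trained on, which is the limitation the abstract attributes to the earlier approach; replacing the table with a function approximator over the remaining demand is what would allow dispatch for arbitrary load values.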