114 results for reinforcement
Abstract:
A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. However, early studies have been limited to very low-dimensional spaces and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This dynamic Bayesian network-based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator-trained policy. © 2013 IEEE.
Abstract:
This study investigates the effect of thermal cycling on the performance of concrete beams retrofitted with CARDIFRC, a new class of high-performance fiber-reinforced cement-based material that is compatible with concrete. Twenty-four beams were subjected to 24 h thermal cycles between 25 and 90°C. One third of the beams were reinforced either in flexure only or in flexure and shear with conventional steel reinforcement and used as control specimens. The remaining sixteen beams were retrofitted with CARDIFRC strips to provide external flexural and/or shear strengthening. All beams were exposed to a varied number of 24 h thermal cycles ranging from 0 to 90 and were tested in four-point bending at room temperature. The tests indicated that the retrofitted members were stronger and stiffer than control beams, and more importantly, that their failure initiated in flexure without any signs of interfacial delamination cracking. The results of these tests are presented and compared to analytical predictions. The predictions show good correlation with the experimental results. © 2010 ASCE.
Abstract:
In order to account for interfacial friction of composite materials, an analytical model based on contact geometry and local friction is proposed. A contact area includes several types of microcontacts depending on reinforcement materials and their shape. A proportion between these areas is defined by in-plane contact geometry. The model applied to a fibre-reinforced composite results in the dependence of friction on surface fibre fraction and local friction coefficients. To validate this analytical model, an experimental study on carbon fibre-reinforced epoxy composites under low normal pressure was performed. The effects of fibre volume fraction and fibre orientation were studied, discussed and compared with analytical model results. © Springer Science+Business Media, LLC 2012.
Abstract:
Copyright 2014 by the author(s). We present a nonparametric prior over reversible Markov chains. We use completely random measures, specifically gamma processes, to construct a countably infinite graph with weighted edges. By enforcing symmetry to make the edges undirected we define a prior over random walks on graphs that results in a reversible Markov chain. The resulting prior over infinite transition matrices is closely related to the hierarchical Dirichlet process but enforces reversibility. A reinforcement scheme has recently been proposed with similar properties, but the de Finetti measure is not well characterised. We take the alternative approach of explicitly constructing the mixing measure, which allows more straightforward and efficient inference at the cost of no longer having a closed form predictive distribution. We use our process to construct a reversible infinite HMM which we apply to two real datasets, one from epigenomics and one ion channel recording.
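The link between undirected weighted edges and reversibility mentioned in the abstract above can be illustrated with a small finite example. This is only a sketch under stated assumptions: it uses a finite symmetric matrix of independent gamma-distributed weights as a stand-in, not the gamma-process construction of the paper, and the function name `reversible_chain_from_weights` is hypothetical.

```python
import numpy as np


def reversible_chain_from_weights(weights):
    """Random walk on an undirected weighted graph (finite illustration).

    Hypothetical helper, not taken from the paper: it only demonstrates
    that symmetric edge weights give a reversible Markov chain.
    """
    w = np.asarray(weights, dtype=float)
    if not np.allclose(w, w.T):
        raise ValueError("edge weights must be symmetric (undirected graph)")
    degree = w.sum(axis=1)              # total weight attached to each node
    transition = w / degree[:, None]    # P[i, j] = w[i, j] / sum_k w[i, k]
    stationary = degree / degree.sum()  # pi[i] proportional to node weight
    return transition, stationary


# Assumption: a 5-state graph with gamma-distributed weights, symmetrised
# so that the edge i-j and the edge j-i share one weight.
rng = np.random.default_rng(0)
n_states = 5
raw = rng.gamma(shape=1.0, scale=1.0, size=(n_states, n_states))
weights = np.triu(raw) + np.triu(raw, k=1).T

P, pi = reversible_chain_from_weights(weights)

# Detailed balance: pi[i] * P[i, j] equals pi[j] * P[j, i] for every pair.
flow = pi[:, None] * P
print(np.allclose(flow, flow.T))
```

Because `pi[i] * P[i, j]` reduces to `w[i, j]` divided by the total weight, symmetry of the weights is exactly what makes detailed balance hold here.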
Abstract:
This paper compares a number of different moment-curvature models for cracked concrete sections that contain both steel and external fiber-reinforced polymer (FRP) reinforcement. The question of whether to use a whole-section analysis or one that considers the FRP separately is discussed. Five existing and three new models are compared with test data for moment-curvature or load deflection behavior, and five models are compared with test results for plate-end debonding using a global energy balance approach (GEBA). A proposal is made for the use of one of the simplified models. The availability of a simplified model opens the way to the production of design aids so that the GEBA can be made available to practicing engineers through design guides and parametric studies. Copyright © 2014, American Concrete Institute.
Abstract:
While underactuated robotic systems are capable of energy-efficient and rapid dynamic behavior, we still do not fully understand how body dynamics can be actively used for adaptive behavior in complex unstructured environments. In particular, we can expect that robotic systems could achieve high maneuverability by flexibly storing and releasing energy through motor control of the physical interaction between the body and the environment. This paper presents a minimalistic strategy for optimizing the motor control policy of underactuated legged robotic systems. Based on a reinforcement learning algorithm, we propose an optimization scheme with which the robot can exploit passive elasticity to hop forward while maintaining stable locomotion in an environment with a series of large changes in the ground surface. We show a case study of a simple one-legged robot that consists of a servomotor and a passive elastic joint. The dynamics and learning performance of the robot model are tested in simulation, and the results are then transferred to the real-world robot. © 2007 IEEE.
Abstract:
As observed in nature, complex locomotion can be generated from an adequate combination of motor primitives. In this context, the paper focuses on experiments that lead to a quality criterion for the design and analysis of motor primitives. First, the impact of different vocabularies on behavioural diversity, the robustness of prelearned behaviours and the learning process is examined. The experiments are performed with the quadruped robot MiniDog6M, for which running and standing-up behaviours are implemented. Further, a reinforcement learning approach based on Q-learning is introduced, which is used to select an adequate sequence of motor primitives. © 2006 Springer-Verlag Berlin Heidelberg.
Abstract:
Computer simulation experiments were performed to examine the effectiveness of OR- and comparative-reinforcement learning algorithms. In the simulation, human rewards were given as +1 and -1. Two models of human instruction, which determine the reward given at each step of an instruction, were used. The results show that human instruction may combine model-A and model-B characteristics, and that the comparative-reinforcement learning algorithm can be expected to be more effective for learning from human instruction.
Abstract:
The ability of large-grain (RE)Ba2Cu3O7-δ ((RE)BCO; RE = rare earth) bulk superconductors to trap magnetic fields is determined by their critical current. With high trapped fields, however, bulk samples are subject to a relatively large Lorentz force, and their performance is limited primarily by their tensile strength. Consequently, sample reinforcement is the key to performance improvement in these technologically important materials. In this work, we report a trapped field of 17.6 T, the largest reported to date, in a stack of two silver-doped GdBCO superconducting bulk samples, each 25 mm in diameter, fabricated by top-seeded melt growth and reinforced with shrink-fit stainless steel. This sample preparation technique has the advantage of being relatively straightforward and inexpensive to implement, and offers the prospect of easy access to portable, high magnetic fields without any requirement for a sustaining current source. © 2014 IOP Publishing Ltd.