13 results for One free bad policy
in Cambridge University Engineering Department Publications Database
Abstract:
It is shown experimentally that an elastic mechanical stress in a crystal structure is a necessary factor for the appearance of free oscillations of the director of a ferroelectric liquid crystal. Such a mechanical stress arises as a result of internal textural perturbations in the presence of regions with a different orientation of the director or is produced by external pressure applied to one of the cell plates in the appropriate direction. © 1999 American Institute of Physics.
Abstract:
This paper presents a practical destruction-free parameter extraction methodology for a new physics-based circuit simulator buffer-layer Integrated Gate Commutated Thyristor (IGCT) model. Most key parameters needed for this model can be extracted by one simple clamped inductive-load switching experiment. To validate this extraction method, a clamped inductive load switching experiment was performed, and corresponding simulations were carried out by employing the IGCT model with parameters extracted through the presented methodology. Good agreement has been obtained between the experimental data and simulation results.
Abstract:
Statistical dialogue models have required a large number of dialogues to optimise the dialogue policy, relying on the use of a simulated user. This results in a mismatch between training and live conditions, and significant development costs for the simulator, thereby mitigating many of the claimed benefits of such models. Recent work on Gaussian process reinforcement learning has shown that learning can be substantially accelerated. This paper reports on an experiment to learn a policy for a real-world task directly from human interaction using rewards provided by users. It shows that a usable policy can be learnt in just a few hundred dialogues without needing a user simulator, using a learning strategy that reduces the risk of taking bad actions. The paper also investigates adaptation behaviour when the system continues learning for several thousand dialogues and highlights the need for robustness to noisy rewards. © 2011 IEEE.
Abstract:
Atmospheric effects can significantly degrade the reliability of free-space optical communications. One such effect, scintillation, is caused by atmospheric turbulence and refers to random fluctuations in the irradiance and phase of the received laser beam. In this paper we investigate the use of multiple lasers and multiple apertures to mitigate scintillation. Since the scintillation process is slow, we adopt a block fading channel model and study the outage probability under the assumptions of orthogonal pulse-position modulation and non-ideal photodetection. Assuming perfect receiver channel state information (CSI), we derive the signal-to-noise ratio (SNR) exponents for the cases when the scintillation is lognormal, exponential and gamma-gamma distributed, which cover a wide range of atmospheric turbulence conditions. Furthermore, when CSI is also available at the transmitter, we illustrate that very large gains in SNR are possible (in some cases larger than 15 dB) by adapting the transmitted power. Under a long-term power constraint, we outline fundamental design criteria via a simple expression that relates the required number of lasers and apertures for a given code rate and number of codeword blocks to completely remove system outages. Copyright © 2009 IEEE.
Abstract:
OBJECTIVE: A standard view in health economics is that, although there is no market that determines the "prices" for health states, people can nonetheless associate health states with monetary values (or other scales, such as quality-adjusted life years [QALYs] and disability-adjusted life years [DALYs]). Such valuations can be used to shape health policy, and a major research challenge is to elicit such values from people; creating experimental "markets" for health states is a theoretically attractive way to address this. We explore the possibility that this framework may be fundamentally flawed, because there may not be any stable values to be revealed. Instead, perhaps people construct ad hoc values, influenced by contextual factors, such as the observed decisions of others. METHOD: The participants bid to buy relief from equally painful electrical shocks to the leg and arm in an experimental health market based on an interactive second-price auction. Thirty subjects were randomly assigned to two experimental conditions where the bids by "others" were manipulated to follow increasing or decreasing price trends for one, but not the other, pain. After the auction, a preference test asked the participants to choose which pain they would prefer to experience for a longer duration. RESULTS: Players remained indifferent between the two pain types throughout the auction. However, their bids were differentially attracted toward what others bid for each pain, with overbidding during decreasing prices and underbidding during increasing prices. CONCLUSION: Health preferences are dissociated from market prices, which are strongly referenced to others' choices. This suggests that the price of health care in a free market has the capacity to become critically detached from people's underlying preferences.
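For readers unfamiliar with the pricing rule, the following is a minimal sketch of a generic sealed-bid second-price auction: the highest bidder wins and pays the second-highest bid. It is an illustration only. The abstract does not describe the study's interactive auction protocol, and the function name, the bidder labels and the bid amounts are all hypothetical.

```python
def second_price_auction(bids):
    """Return (winner, price) under a generic second-price rule.

    bids: dict mapping a bidder label to a bid amount
    (hypothetical structure, not taken from the study).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    # Rank bidders from highest to lowest bid.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price


# Made-up bids, for illustration only.
winner, price = second_price_auction({"A": 0.40, "B": 0.25, "C": 0.10})
print(winner, price)
```

Under this rule the price paid does not depend on the winner's own bid, which is why second-price auctions are commonly used to elicit valuations.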
Abstract:
Taper-free and vertically oriented Ge nanowires were grown on Si (111) substrates by chemical vapor deposition with Au nanoparticle catalysts. To achieve vertical nanowire growth on the highly lattice mismatched Si substrate, a thin Ge buffer layer was first deposited, and to achieve taper-free nanowire growth, a two-temperature process was employed. The two-temperature process consisted of a brief initial base growth step at high temperature followed by prolonged growth at lower temperature. Taper-free and defect-free Ge nanowires grew successfully even at 270 °C, which is 90 °C lower than the bulk eutectic temperature. The yield of vertical and taper-free nanowires is over 90%, comparable to that of vertical but tapered nanowires grown by the conventional one-temperature process. This method is of practical importance and can be reliably used to develop novel nanowire-based devices on relatively cheap Si substrates. Additionally, we observed that the activation energy of Ge nanowire growth by the two-temperature process is dependent on Au nanoparticle size. The low activation energy (∼5 kcal/mol) for 30 and 50 nm diameter Au nanoparticles suggests that the decomposition of gaseous species on the catalytic Au surface is a rate-limiting step. A higher activation energy (∼14 kcal/mol) was determined for 100 nm diameter Au nanoparticles which suggests that larger Au nanoparticles are partially solidified and that growth kinetics become the rate-limiting step. © 2011 American Chemical Society.
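The abstract does not say how the activation energies were determined. A common approach, assumed here, is an Arrhenius analysis in which the growth rate follows rate = A·exp(−Ea/(R·T)), so Ea follows from the slope of ln(rate) against 1/T. The sketch below uses two temperature points and synthetic rates; the function name and all numerical inputs are hypothetical and are not data from the paper.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def activation_energy(t1_kelvin, rate1, t2_kelvin, rate2):
    """Estimate Ea (kcal/mol) from two (temperature, rate) points.

    Assumes Arrhenius behaviour: ln(rate) = ln(A) - Ea / (R * T),
    so the slope of ln(rate) versus 1/T equals -Ea / R.
    """
    slope = (math.log(rate2) - math.log(rate1)) / (1.0 / t2_kelvin - 1.0 / t1_kelvin)
    return -slope * R_KCAL


# Synthetic rates generated with Ea = 5 kcal/mol (illustrative only).
t1, t2 = 543.15, 593.15  # 270 °C and 320 °C in kelvin
rate1 = math.exp(-5.0 / (R_KCAL * t1))
rate2 = math.exp(-5.0 / (R_KCAL * t2))
print(activation_energy(t1, rate1, t2, rate2))
```

With measured data, a least-squares fit over several temperatures would be more robust than this two-point estimate.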
Abstract:
We demonstrate vertically aligned epitaxial GaAs nanowires of excellent crystallographic quality and optimal shape, grown by Au nanoparticle-catalyzed metalorganic chemical vapor deposition. This is achieved by a two-temperature growth procedure, consisting of a brief initial high-temperature growth step followed by prolonged growth at a lower temperature. The initial high-temperature step is essential for obtaining straight, vertically aligned epitaxial nanowires on the (111)B GaAs substrate. The lower temperature employed for subsequent growth imparts superior nanowire morphology and crystallographic quality by minimizing radial growth and eliminating twinning defects. Photoluminescence measurements confirm the excellent optical quality of these two-temperature grown nanowires. Two mechanisms are proposed to explain the success of this two-temperature growth process, one involving Au nanoparticle-GaAs interface conditions and the other involving melting-solidification temperature hysteresis of the Au-Ga nanoparticle alloy.
Abstract:
CW and time-resolved photoluminescence measurements are used to investigate exciton recombination dynamics in GaAs/AlGaAs heterostructure nanowires grown with a recently developed technique which minimizes twinning. A thin capping layer is deposited to eliminate the possibility of oxidation of the AlGaAs shell as a source of oxygen defects in the GaAs core. We observe exciton lifetimes of ∼1 ns, comparable to high quality two-dimensional double heterostructures. These GaAs nanowires allow one to observe state filling and many-body effects resulting from the increased carrier densities accessible with pulsed laser excitation. © 2008 American Institute of Physics.
Abstract:
We demonstrate how a prior assumption of smoothness can be used to enhance the reconstruction of free energy profiles from multiple umbrella sampling simulations using the Bayesian Gaussian process regression approach. The method we derive allows the concurrent use of histograms and free energy gradients and can easily be extended to include further data. In Part I we review the necessary theory and test the method for one collective variable. We demonstrate improved performance with respect to the weighted histogram analysis method and obtain meaningful error bars without any significant additional computation. In Part II we consider the case of multiple collective variables and compare to a reconstruction using least squares fitting of radial basis functions. We find substantial improvements in the regimes of spatially sparse data or short sampling trajectories. A software implementation is made available on www.libatoms.org.
Abstract:
A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. However, early studies have been limited to very low dimensional spaces and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This dynamic Bayesian network based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator trained policy. © 2013 IEEE.
Abstract:
A time-multiplexed rectangular Zernike modal wavefront sensor, based on a nematic phase-only liquid crystal spatial light modulator and specially designed for a high-power two-electrode tapered laser diode (a compact and novel free-space optical communication source), is used in an adaptive beam-steering free-space optical communication system. It enables the system to have 1.25 GHz modulation bandwidth and 4.6° angular coverage, and to sense aberrations within the system and those caused by atmospheric turbulence, up to an absolute amplitude of 0.15 waves, and correct them in one correction cycle. A closed-loop aberration correction algorithm can be implemented to provide convergence for larger and time-varying aberrations. Aberration correction improves the system signal-to-noise-ratio performance. To our knowledge, this is the first time that rectangular orthonormal Zernike polynomials have been used in practice to represent balanced aberrations for a high-power rectangular laser beam. © 2014 IEEE.
Abstract:
While underactuated robotic systems are capable of energy-efficient and rapid dynamic behavior, we still do not fully understand how body dynamics can be actively used for adaptive behavior in complex unstructured environments. In particular, we can expect that robotic systems could achieve high maneuverability by flexibly storing and releasing energy through the motor control of the physical interaction between the body and the environment. This paper presents a minimalistic optimization strategy of motor control policy for underactuated legged robotic systems. Based on a reinforcement learning algorithm, we propose an optimization scheme with which the robot can exploit passive elasticity for hopping forward while maintaining the stability of the locomotion process in an environment with a series of large changes in the ground surface. We show a case study of a simple one-legged robot which consists of a servomotor and a passive elastic joint. The dynamics and learning performance of the robot model are tested in simulation, and the results are then transferred to the real-world robot. © 2007 IEEE.