550 results for Noninvasive temperature estimation
Abstract:
Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0,1) (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation and control spaces, multiple-agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward. ©2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
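For illustration only, here is a minimal Python sketch of the kind of estimator this abstract describes. It assumes the commonly stated form of the update: an eligibility trace z discounted by β, and a running average Δ of reward times trace. The two vectors z and Δ are the only storage, which matches the "twice the number of policy parameters" remark. The toy one-parameter problem and every function name (`gpomdp_estimate`, `sample_action`, `grad_log_policy`, `step`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def gpomdp_estimate(theta, beta, T, sample_action, grad_log_policy, step, y0, rng):
    """Running estimate of the average-reward gradient using only z and delta."""
    z = np.zeros_like(theta, dtype=float)      # eligibility trace
    delta = np.zeros_like(theta, dtype=float)  # running gradient estimate
    y = y0
    for t in range(T):
        u = sample_action(theta, y, rng)             # act on the observation only
        z = beta * z + grad_log_policy(theta, y, u)  # discounted score-function trace
        y, r = step(y, u, rng)                       # next observation and reward
        delta += (r * z - delta) / (t + 1)           # incremental average of r * z
    return delta

# Hypothetical toy problem: one observation, two actions, reward equals the action.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_action(theta, y, rng):
    return int(rng.random() < sigmoid(theta[0]))

def grad_log_policy(theta, y, u):
    # d/dtheta log P(u | theta) for a Bernoulli(sigmoid(theta)) policy
    return np.array([u - sigmoid(theta[0])])

def step(y, u, rng):
    return 0, float(u)

rng = np.random.default_rng(0)
theta = np.array([0.0])
estimate = gpomdp_estimate(theta, beta=0.0, T=20000,
                           sample_action=sample_action,
                           grad_log_policy=grad_log_policy,
                           step=step, y0=0, rng=rng)
print(estimate)
```

In this toy problem the average reward is sigmoid(θ), so the exact gradient at θ = 0 is 0.25, and β = 0 introduces no bias because rewards do not depend on past actions. In a problem with real temporal structure, a larger β would trade lower bias for higher variance, as the abstract notes.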
Abstract:
We consider complexity penalization methods for model selection. These methods aim to choose a model to optimally trade off estimation and approximation errors by minimizing the sum of an empirical risk term and a complexity penalty. It is well known that if we use a bound on the maximal deviation between empirical and true risks as a complexity penalty, then the risk of our choice is no more than the approximation error plus twice the complexity penalty. There are many cases, however, where complexity penalties like this give loose upper bounds on the estimation error. In particular, if we choose a function from a suitably simple convex function class with a strictly convex loss function, then the estimation error (the difference between the risk of the empirical risk minimizer and the minimal risk in the class) approaches zero at a faster rate than the maximal deviation between empirical and true risks. In this paper, we address the question of whether it is possible to design a complexity penalized model selection method for these situations. We show that, provided the sequence of models is ordered by inclusion, in these cases we can use tight upper bounds on estimation error as a complexity penalty. Surprisingly, this is the case even in situations when the difference between the empirical risk and true risk (and indeed the error of any estimate of the approximation error) decreases much more slowly than the complexity penalty. We give an oracle inequality showing that the resulting model selection method chooses a function with risk no more than the approximation error plus a constant times the complexity penalty.
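The basic recipe in this abstract, picking the model that minimizes empirical risk plus a complexity penalty over a nested sequence of models, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the models are polynomials ordered by degree (so they are nested), the loss is squared error, and the penalty `0.5 * (k + 1) / n` is a hypothetical placeholder, not the penalty analysed in the paper.

```python
import numpy as np

def select_model(x, y, max_degree, penalty):
    """Return (degree, coefficients) minimizing empirical risk + penalty(degree, n)."""
    n = len(x)
    best = None
    for k in range(max_degree + 1):
        coeffs = np.polyfit(x, y, k)                         # empirical risk minimizer in class k
        emp_risk = np.mean((np.polyval(coeffs, x) - y) ** 2)  # squared-error empirical risk
        score = emp_risk + penalty(k, n)
        if best is None or score < best[0]:
            best = (score, k, coeffs)
    return best[1], best[2]

# Hypothetical data: a quadratic target observed with small Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + 0.1 * rng.standard_normal(x.size)

# Hypothetical penalty: grows linearly with the number of coefficients.
chosen_degree, chosen_coeffs = select_model(
    x, y, max_degree=6, penalty=lambda k, n: 0.5 * (k + 1) / n
)
print(chosen_degree)
```

The penalty is passed in as a function so that a looser bound (for example, one based on maximal deviation) and a tighter estimation-error bound can be compared on the same data.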
Abstract:
We present a technique for estimating the 6DOF pose of a PTZ camera by tracking a single moving target in the image with known 3D position. This is useful in situations where it is not practical to measure the camera pose directly. Our application domain is estimating the pose of a PTZ camera so that it can be used for automated GPS-based tracking and filming of UAV flight trials. We present results which show that the technique is able to localize a PTZ camera after a short vision-tracked flight, and that the estimated pose is sufficiently accurate for the PTZ camera to then actively track a UAV based on GPS position data.
Abstract:
The effect of thermal radiation on a steady two-dimensional natural convection laminar flow of a viscous, incompressible, optically thick fluid along a vertical flat plate with streamwise sinusoidal surface temperature has been investigated in this study. Using appropriate variables, the basic governing equations are transformed to a convenient form and then solved numerically by two efficient methods, namely, the implicit finite difference (IFD) method together with the Keller box scheme, and the straightforward finite difference (SFFD) method. The effects of varying the physical parameters, for example the conduction-radiation parameter (Planck number), the surface temperature parameter, and the amplitude of the surface temperature, on the skin friction and heat transfer rate are shown numerically. Velocity and temperature profiles as well as streamlines and isotherms are also presented and discussed for varying values of the conduction-radiation parameter. It is found that both the skin friction and the rate of heat transfer are enhanced considerably by increasing the value of the conduction-radiation parameter, Rd.
Abstract:
Laminar magnetohydrodynamic (MHD) natural convection flow from an isothermal sphere immersed in a fluid with viscosity proportional to a linear function of temperature has been studied. The governing boundary layer equations are transformed into a non-dimensional form, and the resulting nonlinear system of partial differential equations is reduced to a convenient form, which is solved numerically by two very efficient methods, namely, (i) the implicit finite difference method together with the Keller box scheme and (ii) a direct numerical scheme. Numerical results are presented in terms of the velocity and temperature distributions, streamlines and isotherms of the fluid, as well as heat transfer characteristics, namely the local skin-friction coefficients and the local heat transfer rate, for a wide range of the magnetohydrodynamic parameter and the viscosity-variation parameter.
Abstract:
Background: There has been increasing interest in assessing the impacts of temperature on mortality. However, few studies have used a case–crossover design to examine non-linear and distributed lag effects of temperature on mortality. Additionally, little evidence is available on the temperature-mortality relationship in China, or on which temperature measure is the best predictor of mortality. Objectives: To use a distributed lag non-linear model (DLNM) as part of a case–crossover design; to examine the non-linear and distributed lag effects of temperature on mortality in Tianjin, China; and to explore which temperature measure is the best predictor of mortality. Methods: The DLNM was applied to a case–crossover design to assess the non-linear and delayed effects of temperatures (maximum, mean and minimum) on deaths (non-accidental, cardiopulmonary, cardiovascular and respiratory). Results: A U-shaped relationship was consistently found between temperature and mortality. Cold effects (significantly increased mortality associated with low temperatures) were delayed by 3 days and persisted for 10 days. Hot effects (significantly increased mortality associated with high temperatures) were acute and lasted for 3 days, and were followed by mortality displacement for non-accidental, cardiopulmonary, and cardiovascular deaths. Mean temperature was a better predictor of mortality (based on model fit) than maximum or minimum temperature. Conclusions: In Tianjin, extreme cold and hot temperatures increased the risk of mortality. Results suggest that the effects of cold last longer than the effects of heat. It is possible to combine the case–crossover design with DLNMs. This allows the case–crossover design to flexibly estimate the non-linear and delayed effects of temperature (or air pollution) whilst controlling for season.
Abstract:
Estimates of the half-life to convergence of prices across a panel of cities are subject to bias from three potential sources: inappropriate cross-sectional aggregation of heterogeneous coefficients, presence of lagged dependent variables in a model with individual fixed effects, and time aggregation of commodity prices. This paper finds no evidence of heterogeneity bias in annual CPI data for 17 U.S. cities from 1918 to 2006, but correcting for the “Nickell bias” and time aggregation bias produces a half-life of 7.5 years, shorter than estimates from previous studies.
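To make the half-life figure concrete, the following Python sketch converts between a persistence coefficient and a half-life. It assumes a simple AR(1) process for the price deviation, in which a shock decays by a factor ρ each period and the half-life is ln(0.5)/ln(ρ). The abstract does not state the paper's exact specification, so the function names and the AR(1) assumption are illustrative only.

```python
import math

def half_life(rho):
    """Periods needed for a shock to decay by half under AR(1) persistence rho."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(rho)

def implied_rho(half_life_periods):
    """Persistence coefficient implied by a given half-life (inverse of half_life)."""
    if half_life_periods <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (1.0 / half_life_periods)

# Annual persistence implied by the 7.5-year half-life quoted in the abstract,
# under the AR(1) assumption stated above.
rho_annual = implied_rho(7.5)
print(rho_annual, half_life(rho_annual))
```

Because the half-life is very sensitive to ρ when ρ is close to 1, small biases in the estimated coefficient (such as those the abstract lists) can shift the implied half-life by years.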