96 results for Critic test
Abstract:
The Spoken Dialog Challenge 2010 was an exercise to investigate how different spoken dialog systems perform on the same task. The existing Let's Go Pittsburgh Bus Information System was used as a task and four teams provided systems that were first tested in controlled conditions with speech researchers as users. The three most stable systems were then deployed to real callers. This paper presents the results of the live tests, and compares them with the control test results. Results show considerable variation both between systems and between the control and live tests. Interestingly, relatively high task completion for controlled tests did not always predict relatively high task completion for live tests. Moreover, even though the systems were quite different in their designs, we saw very similar correlations between word error rate and task completion for all the systems. The dialog data collected is available to the research community. © 2011 Association for Computational Linguistics.
Abstract:
Deformations of sandy soils around geotechnical structures generally involve strains ranging from small (0·01%) to medium (0·5%). In this strain range the soil exhibits non-linear stress-strain behaviour, which should be incorporated in any deformation analysis. In order to capture the possible variability in the non-linear behaviour of various sands, a database was constructed comprising the secant shear modulus degradation curves of 454 tests from the literature. By obtaining a unique S-shaped curve of shear modulus degradation, a modified hyperbolic relationship was fitted. The three curve-fitting parameters are: an elastic threshold strain γe, up to which the elastic shear modulus is effectively constant at G0; a reference strain γr, defined as the shear strain at which the secant modulus has reduced to 0·5G0; and a curvature parameter a, which controls the rate of modulus reduction. The two characteristic strains γe and γr were found to vary with sand type (i.e. uniformity coefficient), soil state (i.e. void ratio, relative density) and mean effective stress. The new empirical expression for shear modulus reduction G/G0 is shown to make predictions that are accurate within a factor of 1·13 for one standard deviation of random error, as determined from 3860 data points. The initial elastic shear modulus, G0, should always be measured if possible, but a new empirical relation is shown to provide estimates within a factor of 1·6 for one standard deviation of random error, as determined from 379 tests. The new expressions for non-linear deformation are easy to apply in practice, and should be useful in the analysis of geotechnical structures under static loading.
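The three-parameter curve described in this abstract can be illustrated with a short sketch. The abstract does not give the fitted expression, so the functional form below is an assumption: it is one hyperbolic form chosen so that G/G0 stays at 1 up to γe and equals 0·5 exactly at γr, matching the definitions above. The paper's own expression may differ, and the function name and numerical values are hypothetical, not fitted results.

```python
def modulus_ratio(gamma, gamma_e, gamma_r, a):
    """Return G/G0 at shear strain gamma (all strains in the same units, e.g. %).

    Assumed form (not taken from the paper):
        G/G0 = 1                                        for gamma <= gamma_e
        G/G0 = 1 / (1 + ((gamma - gamma_e) / (gamma_r - gamma_e)) ** a)  otherwise
    """
    if gamma_r <= gamma_e:
        raise ValueError("gamma_r must exceed gamma_e")
    if gamma <= gamma_e:
        # Below the elastic threshold the modulus is treated as constant at G0.
        return 1.0
    x = (gamma - gamma_e) / (gamma_r - gamma_e)
    return 1.0 / (1.0 + x ** a)


# Hypothetical parameter values, for illustration only.
GAMMA_E = 0.001   # elastic threshold strain, %
GAMMA_R = 0.05    # reference strain, %
A = 0.9           # curvature parameter

if __name__ == "__main__":
    for gamma in (0.0005, 0.001, 0.01, 0.05, 0.5):
        ratio = modulus_ratio(gamma, GAMMA_E, GAMMA_R, A)
        print(f"gamma = {gamma:.4f}%  G/G0 = {ratio:.3f}")
```

In this sketch a larger value of a steepens the fall of G/G0 around γr, which is the role the abstract assigns to the curvature parameter.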
Crashworthiness of helicopters on water: Test and simulation of a full-scale WG30 impacting on water
Abstract:
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only been partially elucidated. On one hand, experimental evidence shows that the neuromodulator dopamine carries information about rewards and affects synaptic plasticity. On the other hand, the theory of reinforcement learning provides a framework for reward-based learning. Recent models of reward-modulated spike-timing-dependent plasticity have made first steps towards bridging the gap between the two approaches, but faced two problems. First, reinforcement learning is typically formulated in a discrete framework, ill-adapted to the description of natural situations. Second, biologically plausible models of reward-modulated spike-timing-dependent plasticity require precise calculation of the reward prediction error, yet it remains to be shown how this can be computed by neurons. Here we propose a solution to these problems by extending the continuous temporal difference (TD) learning of Doya (2000) to the case of spiking neurons in an actor-critic network operating in continuous time, and with continuous state and action representations. In our model, the critic learns to predict expected future rewards in real time. Its activity, together with actual rewards, conditions the delivery of a neuromodulatory TD signal to itself and to the actor, which is responsible for action choice. In simulations, we show that such an architecture can solve a Morris water-maze-like navigation task, in a number of trials consistent with reported animal performance. We also use our model to solve the acrobot and the cartpole problems, two complex motor control tasks. Our model provides a plausible way of computing reward prediction error in the brain. Moreover, the analytically derived learning rule is consistent with experimental evidence for dopamine-modulated spike-timing-dependent plasticity.
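As a rough illustration of the quantity the critic must compute, the sketch below evaluates the continuous-time TD error of Doya (2000), δ(t) = r(t) − V(t)/τ + dV/dt, using a finite-difference estimate of dV/dt. This is a rate-based simplification and not the spiking learning rule derived in the paper; the linear critic, the function names, and the step sizes are assumptions made for the example.

```python
def td_error(reward, v_now, v_next, dt, tau):
    """Continuous-time TD error: delta = r - V/tau + dV/dt.

    dV/dt is approximated by a forward difference over the time step dt;
    tau is the discount time constant of the value function.
    """
    v_dot = (v_next - v_now) / dt
    return reward - v_now / tau + v_dot


def critic_step(weights, features, delta, learning_rate, dt):
    """One Euler step of a linear critic V = sum(w_i * phi_i).

    Assumed update (a common gradient-style rule, not the paper's spiking rule):
        dw_i/dt = learning_rate * delta * phi_i
    """
    return [w + learning_rate * delta * phi * dt
            for w, phi in zip(weights, features)]


if __name__ == "__main__":
    # With no value estimate yet, a reward of 1 gives a positive TD error,
    # which pushes up the weight of the active feature.
    delta = td_error(reward=1.0, v_now=0.0, v_next=0.0, dt=0.01, tau=0.5)
    print(delta, critic_step([0.0, 0.0], [1.0, 0.0], delta, 0.1, 0.01))
```

A consistency check on the formula: for a constant reward r the discounted value is V = τ·r with dV/dt = 0, so the TD error vanishes once the critic's prediction is correct.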