43 results for simulator
Abstract:
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estimate the parameters of a dialogue policy which selects the system's responses based on the inferred dialogue state. However, the inference of the dialogue state itself depends on a dialogue model which describes the expected behaviour of a user when interacting with the system. Ideally the parameters of this dialogue model should also be optimised to maximise the expected cumulative reward. This article presents two novel reinforcement algorithms for learning the parameters of a dialogue model. First, the Natural Belief Critic algorithm is designed to optimise the model parameters while the policy is kept fixed. This algorithm is suitable, for example, in systems using a handcrafted policy, perhaps prescribed by other design considerations. Second, the Natural Actor and Belief Critic algorithm jointly optimises both the model and the policy parameters. The algorithms are evaluated on a statistical dialogue system modelled as a Partially Observable Markov Decision Process in a tourist information domain. The evaluation is performed with a user simulator and with real users. The experiments indicate that model parameters estimated to maximise the expected reward function provide improved performance compared to the baseline handcrafted parameters. © 2011 Elsevier Ltd. All rights reserved.
Abstract:
This paper presents an agenda-based user simulator which has been extended to be trainable on real data with the aim of more closely modelling the complex rational behaviour exhibited by real users. The trainable part is formed by a set of random decision points that may be encountered during the process of receiving a system act and responding with a user act. A sample-based method is presented for using real user data to estimate the parameters that control these decisions. Evaluation results are given both in terms of statistics of generated user behaviour and the quality of policies trained with different simulators. Compared to a handcrafted simulator, the trained system provides a much better fit to corpus data and evaluations suggest that this better fit should result in improved dialogue performance. © 2010 Association for Computational Linguistics.
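As a rough illustration of what "random decision points" with trainable parameters could look like, the sketch below treats each decision point as a yes/no choice and estimates its probability from observed outcomes using smoothed relative frequencies. This is a deliberately simplified stand-in, not the sample-based method the abstract refers to; the decision-point names, the observations, and both function names are hypothetical.

```python
from collections import Counter
import random


def estimate_decision_probs(observations, smoothing=1.0):
    """Estimate P(outcome is True) per decision point from (name, outcome) pairs.

    Uses add-`smoothing` relative frequencies. This is a simplification and
    not the sample-based estimator described in the abstract.
    """
    taken = Counter()
    total = Counter()
    for name, outcome in observations:
        total[name] += 1
        if outcome:
            taken[name] += 1
    return {
        name: (taken[name] + smoothing) / (total[name] + 2 * smoothing)
        for name in total
    }


def sample_decision(probs, name, rng):
    """Draw one random yes/no decision for the named decision point."""
    return rng.random() < probs[name]


# Hypothetical observations: (decision point, whether the user took the branch).
observations = [
    ("add_constraint", True),
    ("add_constraint", True),
    ("add_constraint", False),
    ("repeat_request", False),
]
probs = estimate_decision_probs(observations)
```

A simulator built this way would call `sample_decision` each time it reaches a decision point, so the generated behaviour follows the estimated probabilities rather than hand-set ones.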
Abstract:
Statistical dialogue models have required a large number of dialogues to optimise the dialogue policy, relying on the use of a simulated user. This results in a mismatch between training and live conditions, and in significant development costs for the simulator, thereby mitigating many of the claimed benefits of such models. Recent work on Gaussian process reinforcement learning has shown that learning can be substantially accelerated. This paper reports on an experiment to learn a policy for a real-world task directly from human interaction using rewards provided by users. It shows that a usable policy can be learnt in just a few hundred dialogues without needing a user simulator, using a learning strategy that reduces the risk of taking bad actions. The paper also investigates adaptation behaviour when the system continues learning for several thousand dialogues and highlights the need for robustness to noisy rewards. © 2011 IEEE.
Abstract:
Elderly and disabled people can benefit greatly from advances in modern electronic devices, which can help them to engage more fully with the world. However, existing design practices often isolate elderly or disabled users by considering them as users with special needs. This article presents a simulator that can reflect problems faced by elderly and disabled users while they use computers, televisions, and similar electronic devices. The simulator embodies both the internal state of an application and the perceptual, cognitive, and motor processes of its user. It can help interface designers to understand, visualize, and measure the effect of impairment on interaction with an interface. Initially a brief survey of different user modeling techniques is presented, and then the existing models are classified into different categories. In the context of existing modeling approaches, the work on user modeling is presented for people with a wide range of abilities. A few applications of the simulator are also discussed; they show that its predictions are accurate enough to inform design choices and point out the implications and limitations of the work. © 2012 Copyright Taylor and Francis Group, LLC.
Abstract:
An existing driver-vehicle model with neuromuscular dynamics is improved in the areas of cognitive delay, intrinsic muscle dynamics and alpha-gamma co-activation. The model is used to investigate the influence of steering torque feedback and neuromuscular dynamics on the vehicle response to lateral force disturbances. When steering torque feedback is present, it is found that the longitudinal position of the lateral disturbance has a significant influence on whether the driver's reflex response reinforces or attenuates the effect of the disturbance. The response to angle and torque overlay inputs to the steering system is also investigated. The presence of the steering torque feedback reduced the disturbing effect of torque overlay and angle overlay inputs. Reflex action reduced the disturbing effect of a torque overlay input, but increased the disturbing effect of an angle overlay input. Experiments on a driving simulator showed that measured handwheel angle response to an angle overlay input was consistent with the response predicted by the model with reflex action. However, there was significant intra- and inter-subject variability. The results highlight the significance of a driver's neuromuscular dynamics in determining the vehicle response to disturbances. © 2012 Copyright Taylor and Francis Group, LLC.
Abstract:
We describe the design steps and final implementation of a MIMO OFDM prototype platform developed to enhance the performance of wireless LAN standards such as HiperLAN/2 and 802.11, using multiple transmit and multiple receive antennas. We first describe the channel measurement campaign used to characterize the indoor operational propagation environment, and analyze the influence of the channel on code design through a ray-tracing channel simulator. We also comment on some antenna and RF issues which are of importance for the final realization of the testbed. Multiple coding, decoding, and channel estimation strategies are discussed and their respective performance-complexity trade-offs are evaluated over the realistic channel obtained from the propagation studies. Finally, we present the design methodology, including cross-validation of the Matlab, C++, and VHDL components, and the final demonstrator architecture. We highlight the increased measured performance of the MIMO testbed over the single-antenna system.
Abstract:
The optimization of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet, it is still the case that the commonly used training algorithms for SDS require a large number of dialogues and hence most systems still rely on artificial data generated by a user simulator. Optimization is therefore performed off-line before releasing the system to real users. Gaussian Processes (GP) for RL have recently been applied to dialogue systems. One advantage of GP is that they compute an explicit measure of uncertainty in the value function estimates computed during learning. In this paper, a class of novel learning strategies is described which use uncertainty to control exploration on-line. Comparisons between several exploration schemes show that significant improvements to learning speed can be obtained and that rapid and safe online optimisation is possible, even on a complex task. Copyright © 2011 ISCA.
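One simple way to let value-function uncertainty steer exploration is sketched below: a value is sampled for each action from a Gaussian with the estimated mean and variance, and the action with the highest sample is chosen, so actions whose estimates are uncertain get tried more often. This is only an illustration of the general idea and is not claimed to match any of the strategies compared in the paper; the action names, the numbers, and the function `select_action` are hypothetical.

```python
import math
import random


def select_action(q_mean, q_var, rng, explore=True):
    """Pick an action from per-action mean and variance of the Q-value estimate.

    With explore=True, one value per action is sampled from N(mean, var) and
    the argmax of the samples is returned. With explore=False, the action with
    the highest mean is returned (purely greedy).
    """
    if explore:
        scores = {a: rng.gauss(q_mean[a], math.sqrt(q_var[a])) for a in q_mean}
    else:
        scores = dict(q_mean)
    return max(scores, key=scores.get)


# Hypothetical estimates for three system actions.
q_mean = {"confirm": 1.0, "request": 0.8, "inform": 0.2}
q_var = {"confirm": 0.01, "request": 0.5, "inform": 0.01}

rng = random.Random(0)
action = select_action(q_mean, q_var, rng)
```

As the variances shrink during learning, the sampled values approach the means and the choice becomes greedy, which is one way exploration can taper off without a separate schedule.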
Abstract:
In this paper we compare different approaches to calculating the charge density in the 2DEG layer of AlGaN/GaN HEMTs. The methods used are (i) analytical theory implemented in MATLAB, (ii) finite-element analysis using semiconductor TCAD software that implements only the Poisson and continuity equations, and (iii) 1D software that solves the Poisson and Schrödinger equations self-consistently. By using the 1D Poisson-Schrödinger solver, we highlight the consequences of neglecting the Schrödinger equation. We conclude that the TCAD simulator predicts with a reasonable level of accuracy the electron density in the 2DEG layer for both a conventional HEMT structure and one featuring an extra GaN cap layer. In addition, while the sheet charge density is not significantly affected by including Schrödinger, its confinement in the channel is found to be modified. © 2012 IEEE.
Abstract:
Capability loss simulators give designers a brief experience of some of the functional effects of capability loss. They are an effective method of helping people to understand the impact of capability loss on product use. However, it is also important that designers know what levels of loss are being simulated and how they relate to the user population. The study in this paper tested the Cambridge Simulation Glasses with 25 participants to determine the effect of different numbers of glasses on a person's visual acuity. This data is also related to the glasses' use in usability assessment. A procedure is described for determining the number of simulator glasses with which the visual detail on a product is just visible. This paper then explains how to calculate the proportion of the UK population who would be unable to distinguish that detail.
Abstract:
Managing change can be challenging due to the high levels of interdependency in concurrent engineering processes. A key activity in engineering change management is propagation analysis, which can be supported using the change prediction method. In common with most other change prediction approaches, the change prediction method has three important limitations: L1: it depends on highly subjective input data; L2: it is capable of modelling 'generalised cases' only and cannot be customised to assess specific changes; and L3: the input data are static, and thus, guidance does not reflect changes in the design. This article contributes to resolving these limitations by incorporating interface information into the change prediction method. The enhanced method is illustrated using an example based on a flight simulator. © The Author(s) 2013.
Abstract:
A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. However, early studies have been limited to very low-dimensional spaces and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This system, based on a dynamic Bayesian network, has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator-trained policy. © 2013 IEEE.
Abstract:
This letter demonstrates for the first time the effect of the incomplete ionization (I.I.) of the transparent p-anode layer on the static and dynamic characteristics of field-stop insulated gate bipolar transistors (FS IGBTs). This effect needs to be considered in TCAD modeling of FS IGBTs to accurately match the device characteristics across a wide range of temperatures. The acceptor ionization energy (EA) governing the I.I. mechanism for the p-anode is extracted by matching the experimental turn-off waveforms and the static performance with the Medici simulator. © 1980-2012 IEEE.
Abstract:
A partially observable Markov decision process (POMDP) has been proposed as a dialog model that enables automatic optimization of the dialog policy and provides robustness to speech understanding errors. Various approximations allow such a model to be used for building real-world dialog systems. However, they require a large number of dialogs to train the dialog policy and hence they typically rely on the availability of a user simulator. They also require significant designer effort to hand-craft the policy representation. We investigate the use of Gaussian processes (GPs) in policy modeling to overcome these problems. We show that GP policy optimization can be implemented for a real world POMDP dialog manager, and in particular: 1) we examine different formulations of a GP policy to minimize variability in the learning process; 2) we find that the use of GP increases the learning rate by an order of magnitude thereby allowing learning by direct interaction with human users; and 3) we demonstrate that designer effort can be substantially reduced by basing the policy directly on the full belief space thereby avoiding ad hoc feature space modeling. Overall, the GP approach represents an important step forward towards fully automatic dialog policy optimization in real world systems. © 2013 IEEE.