44 results for Strictly positive real systems
Abstract:
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estimate the parameters of a dialogue policy which selects the system's responses based on the inferred dialogue state. However, the inference of the dialogue state itself depends on a dialogue model which describes the expected behaviour of a user when interacting with the system. Ideally the parameters of this dialogue model should be also optimised to maximise the expected cumulative reward. This article presents two novel reinforcement algorithms for learning the parameters of a dialogue model. First, the Natural Belief Critic algorithm is designed to optimise the model parameters while the policy is kept fixed. This algorithm is suitable, for example, in systems using a handcrafted policy, perhaps prescribed by other design considerations. Second, the Natural Actor and Belief Critic algorithm jointly optimises both the model and the policy parameters. The algorithms are evaluated on a statistical dialogue system modelled as a Partially Observable Markov Decision Process in a tourist information domain. The evaluation is performed with a user simulator and with real users. The experiments indicate that model parameters estimated to maximise the expected reward function provide improved performance compared to the baseline handcrafted parameters. © 2011 Elsevier Ltd. All rights reserved.
Abstract:
On-body sensor systems for sport are challenging since the sensors must be lightweight and small to avoid discomfort, and yet robust and highly accurate to withstand and capture the fast movements associated with sport. In this work, we detail our experience of building such an on-body system for track athletes. The paper describes the design, implementation and deployment of an on-body sensor system for sprint training sessions. We autonomously profile sprints to derive quantitative metrics to improve training sessions. Inexpensive Force Sensitive Resistors (FSRs) are used to capture foot events that are subsequently analysed and presented back to the coach. We show how to identify periods of sprinting from the FSR data and how to compute metrics such as ground contact time. We evaluate our system using force plates and show that millisecond-level accuracy is achievable when estimating contact times. © 2012 Elsevier B.V. All rights reserved.
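As an illustration of how ground contact time might be derived from FSR readings, here is a minimal sketch. It assumes a simple fixed-threshold rule, which the abstract does not confirm as the authors' method; the function names, the threshold and the sample rate are all hypothetical.

```python
from typing import List, Sequence, Tuple


def contact_intervals(samples: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs at or above threshold.

    The end index is exclusive. The fixed threshold is an assumption made
    for illustration, not the detection rule used in the paper.
    """
    intervals = []
    start = None
    for i, value in enumerate(samples):
        if value >= threshold and start is None:
            start = i  # foot strike: reading rises to the threshold
        elif value < threshold and start is not None:
            intervals.append((start, i))  # toe-off: reading drops below it
            start = None
    if start is not None:
        # The recording ended while the foot was still on the ground.
        intervals.append((start, len(samples)))
    return intervals


def contact_times_ms(samples: Sequence[float], threshold: float,
                     sample_rate_hz: float) -> List[float]:
    """Convert each contact interval to a duration in milliseconds."""
    return [(end - start) * 1000.0 / sample_rate_hz
            for start, end in contact_intervals(samples, threshold)]
```

For example, `contact_times_ms([0, 0, 5, 6, 7, 0, 0, 8, 9, 0], 3.0, 1000.0)` yields one duration per detected foot contact.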
Abstract:
Statistical dialogue models have required a large number of dialogues to optimise the dialogue policy, relying on the use of a simulated user. This results in a mismatch between training and live conditions, and significant development costs for the simulator, thereby mitigating many of the claimed benefits of such models. Recent work on Gaussian process reinforcement learning has shown that learning can be substantially accelerated. This paper reports on an experiment to learn a policy for a real-world task directly from human interaction using rewards provided by users. It shows that a usable policy can be learnt in just a few hundred dialogues without needing a user simulator, using a learning strategy that reduces the risk of taking bad actions. The paper also investigates adaptation behaviour when the system continues learning for several thousand dialogues and highlights the need for robustness to noisy rewards. © 2011 IEEE.
Abstract:
A number of recent scientific and engineering problems require signals to be decomposed into a product of a slowly varying positive envelope and a quickly varying carrier whose instantaneous frequency also varies slowly over time. Although signal processing provides algorithms for so-called amplitude- and frequency-demodulation (AFD), there are well-known problems with all of the existing methods. Motivated by the fact that AFD is ill-posed, we approach the problem using probabilistic inference. The new approach, called probabilistic amplitude and frequency demodulation (PAFD), models instantaneous frequency using an auto-regressive generalization of the von Mises distribution, and the envelopes using Gaussian auto-regressive dynamics with a positivity constraint. A novel form of expectation propagation is used for inference. We demonstrate that although PAFD is computationally demanding, it outperforms previous approaches on synthetic and real signals in clean, noisy and missing data settings.
Abstract:
Speech recognition systems typically contain many Gaussian distributions, and hence a large number of parameters. This makes them both slow to decode speech, and large to store. Techniques have been proposed to decrease the number of parameters. One approach is to share parameters between multiple Gaussians, thus reducing the total number of parameters and allowing for shared likelihood calculation. Gaussian tying and subspace clustering are two related techniques which take this approach to system compression. These techniques can decrease the number of parameters with no noticeable drop in performance for single systems. However, multiple acoustic models are often used in real speech recognition systems. This paper considers the application of Gaussian tying and subspace compression to multiple systems. Results show that two speech recognition systems can be modelled using the same number of Gaussians as just one system, with little effect on individual system performance. Copyright © 2009 ISCA.
Abstract:
In this paper, a novel MPC strategy is proposed, and referred to as ℓasso MPC. The new paradigm features an ℓ1-regularised least squares loss function, in which the control error variance competes with the sum of input channel magnitudes (or slew rates) over the whole horizon length. This cost choice is motivated by the successful development of LASSO theory in signal processing and machine learning. In the latter fields, sum-of-norms regularisation has shown a strong capability to provide robust and sparse solutions for system identification and feature selection. In this paper, a discrete-time dual-mode ℓasso MPC is formulated, and its stability is proven by application of standard MPC arguments. The controller is then tested for the problem of ship course keeping and roll reduction with rudder and fins, in a directional stochastic sea. Simulations show the ℓasso MPC to inherit positive features from its corresponding regressor: extreme reduction of decision variables' magnitude, namely, actuators' magnitude (or variations), with a finite energy error, being particularly promising for over-actuated systems. © 2012 AACC (American Automatic Control Council).
Abstract:
This paper presents an adaptive Sequential Monte Carlo approach for real-time applications. The Sequential Monte Carlo method is employed to estimate the states of dynamic systems using weighted particles. The proposed approach reduces the run-time computational complexity by adapting the size of the particle set. Multiple processing elements on FPGAs are dynamically allocated for improved energy efficiency without violating real-time constraints. A robot localisation application is developed based on the proposed approach. Compared to a non-adaptive implementation, the dynamic energy consumption is reduced by up to 70% without affecting the quality of solutions. © 2012 IEEE.
Abstract:
Ubiquitous in-building Real Time Location Systems (RTLS) today are limited by costly active radio frequency identification (RFID) tags and short-range portal readers of low-cost passive RFID tags. We, however, present a novel technology that locates RFID tags using a new approach based on (a) minimising RFID fading using antenna diversity, frequency dithering, phase dithering and narrow beam-width antennas, (b) measuring a combination of RSSI and phase shift in the coherent received tag backscatter signals and (c) being selective in the use of information from the system by applying weighting techniques to minimise error. These techniques make it possible to locate tags to an accuracy of less than one metre. This breakthrough will enable, for the first time, the low-cost tagging of items and the possibility of locating them at relatively high precision.
Abstract:
Optical motion capture systems suffer from marker occlusions resulting in loss of useful information. This paper addresses the problem of real-time joint localisation of legged skeletons in the presence of such missing data. The data is assumed to be labelled 3d marker positions from a motion capture system. An integrated framework is presented which predicts the occluded marker positions using a Variable Turn Model within an Unscented Kalman filter. Inferred information from neighbouring markers is used as observation states; these constraints are efficient, simple, and real-time implementable. This work also takes advantage of the common case that missing markers are still visible to a single camera, by combining predictions with under-determined positions, resulting in more accurate predictions. An Inverse Kinematics technique is then applied ensuring that the bone lengths remain constant over time; the system can thereby maintain a continuous data-flow. The marker and Centre of Rotation (CoR) positions can be calculated with high accuracy even in cases where markers are occluded for a long period of time. Our methodology is tested against some of the most popular methods for marker prediction and the results confirm that our approach outperforms these methods in estimating both marker and CoR positions. © 2012 Springer-Verlag.
Abstract:
The next generation of diesel emission control devices includes 4-way catalyzed filtration systems (4WCFS) consisting of both NOx and diesel particulate matter (DPM) control. A methodology was developed to simultaneously evaluate the NOx and DPM control performance of miniature 4WCFS made from acicular mullite, an advanced ceramic material (ACM), that were challenged with diesel exhaust. The impact of catalyst loading and substrate porosity on catalytic performance of the NOx trap was evaluated. Simultaneously with NOx measurements, the real-time solid particle filtration performance of catalyst-coated standard and high porosity filters was determined for steady-state and regenerative conditions. The use of high porosity ACM 4-way catalyzed filtration systems reduced NOx by 99% and solid and total particulate matter by 95% when averaged over 10 regeneration cycles. A "regeneration cycle" refers to an oxidizing ("lean") exhaust condition followed by a reducing ("rich") exhaust condition resulting in NOx storage and NOx reduction (i.e., trap "regeneration"), respectively. Standard porosity ACM 4-way catalyzed filtration systems reduced NOx by 60-75% and exhibited 99.9% filtration efficiency. The rich/lean cycling used to regenerate the filter had almost no impact on solid particle filtration efficiency but impacted NOx control. Cycling resulted in the formation of very low concentrations of semivolatile nucleation mode particles for some 4WCFS formulations. Overall, 4WCFS show promise for significantly reducing diesel emissions into the atmosphere in a single control device. © 2013 American Chemical Society.
Abstract:
We present the development of a drug-loaded triple-layer platform consisting of thin film biodegradable polymers, in a properly designed form for the desired gradual degradation. Poly(dl-lactide-co-glycolide) (PLGA (65:35), PLGA (75:25)) and polycaprolactone (PCL) were grown by spin coating technique, to synthesize the platforms with the order PCL/PLGA (75:25)/PLGA (65:35) that determine their degradation rates. The outer PLGA (65:35) layer was loaded with dipyridamole, an antiplatelet drug. Spectroscopic ellipsometry (SE) in the Vis-far UV range was used to determine the nanostructure, as well as the content of the incorporated drug in the as-grown platforms. In situ and real-time SE measurements were carried out using a liquid cell for the dynamic evaluation of the fibrinogen and albumin protein adsorption processes. Atomic force microscopy studies justified the SE results concerning the nanopores formation in the polymeric platforms, and the dominant adsorption mechanisms of the proteins, which were defined by the drug incorporation in the platforms. © 2013 Elsevier B.V. All rights reserved.
Abstract:
Flow measurement data at the district meter area (DMA) level has the potential for burst detection in water distribution systems. This work investigates using a polynomial function fitted to the historic flow measurements based on a weighted least-squares method for automatic burst detection in U.K. water distribution networks. This approach, when used in conjunction with an expectation-maximization (EM) algorithm, can automatically select useful data from the historic flow measurements, which may contain normal and abnormal operating conditions in the distribution network, e.g., water burst. Thus, the model can estimate the normal water flow (nonburst condition), and hence the burst size on the water distribution system can be calculated from the difference between the measured flow and the estimated flow. The distinguishing feature of this method is that the burst detection is fully unsupervised, and the burst events that have occurred in the historic data do not affect the procedure or bias the burst detection algorithm. Experimental validation of the method has been carried out using a series of flushing events that simulate burst conditions to confirm that the simulated burst sizes are capable of being estimated correctly. This method was also applied to eight DMAs with known real burst events, and the results of burst detections are shown to relate to the water company's records of pipeline repair work. © 2014 American Society of Civil Engineers.
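The core calculation described above (a weighted least-squares polynomial fit, then burst size as measured flow minus estimated flow) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the weights are supplied by hand rather than selected by the EM step, and all function names are hypothetical.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_polyfit(t: Sequence[float], y: Sequence[float],
                     w: Sequence[float], degree: int) -> List[float]:
    """Coefficients c[0..degree] minimising sum_i w_i * (y_i - p(t_i))**2.

    Built from the weighted normal equations. In the abstract the useful
    data are selected by an EM algorithm; here the weights w are simply
    given (an assumption), with 0 meaning "ignore this sample".
    """
    n = degree + 1
    a = [[sum(wi * ti ** (j + k) for ti, wi in zip(t, w)) for k in range(n)]
         for j in range(n)]
    b = [sum(wi * yi * ti ** j for ti, yi, wi in zip(t, y, w)) for j in range(n)]
    return solve_linear(a, b)


def estimated_flow(coeffs: Sequence[float], t: float) -> float:
    """Evaluate the fitted polynomial (the 'normal' flow) at time t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))


def burst_size(coeffs: Sequence[float], t: float, measured: float) -> float:
    """Burst size as measured flow minus estimated normal flow."""
    return measured - estimated_flow(coeffs, t)
```

Giving an abnormal historic sample a weight of zero keeps it from influencing the fit, which loosely mirrors the role the abstract assigns to the EM-based data selection.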