12 results for security policy model
in Cambridge University Engineering Department Publications Database
Abstract:
A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. However, early studies have been limited to very low-dimensional spaces and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This dynamic Bayesian network-based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator-trained policy. © 2013 IEEE.
Abstract:
The design of a sustainable electricity generation and transmission system is based on the established science of anthropogenic climate change and the realization that dependence on imported fossil fuels is becoming a measure of insecurity of energy supply. A model is proposed which integrates generation fuel mix composition, assignment of plants and optimized power flow, using Portugal as a case study. The result of this co-optimized approach is an overall set of generator types/fuels which increases the diversity of Portuguese electricity supply, lowers its dependency on imported fuels by 14.62% and moves the country towards meeting its regional and international obligations of 31% energy from renewables by 2020 and a 27% reduction in greenhouse gas emissions by 2012, respectively. The quantity and composition of power generation at each bus is specified, with particular focus on quantifying the amount of distributed generation. Based on other works, the resultant overall distributed capacity penetration of 19.02% of total installed generation is expected to yield positive network benefits. Thus, the model demonstrates that national energy policy and technical deployment can be linked through sustainability and, moreover, that the respective goals may be mutually achieved via holistic, integrated design. © 2009 IEEE.
Abstract:
Over the past decade, a variety of user models have been proposed for user simulation-based reinforcement learning of dialogue strategies. However, the strategies learned with these models are rarely evaluated in actual user trials and it remains unclear how the choice of user model affects the quality of the learned strategy. In particular, the degree to which strategies learned with a user model generalise to real user populations has not been investigated. This paper presents a series of experiments that qualitatively and quantitatively examine the effect of the user model on the learned strategy. Our results show that the performance and characteristics of the strategy are in fact highly dependent on the user model. Furthermore, a policy trained with a poor user model may appear to perform well when tested with the same model, but fail when tested with a more sophisticated user model. This raises significant doubts about the current practice of learning and evaluating strategies with the same user model. The paper further investigates a new technique for testing and comparing strategies directly on real human-machine dialogues, thereby avoiding any evaluation bias introduced by the user model. © 2005 IEEE.
Abstract:
The development of health policy is recognized as complex; however, there has been little development of the role of agency in this process. Kingdon developed the concept of policy entrepreneur (PE) within his ‘windows’ model. He argued that inter-related ‘policy streams’ must coincide for important issues to be addressed. The conjoining of these streams may be aided by a policy entrepreneur. We contribute by clarifying the role of the policy entrepreneur and highlighting the translational processes of key actors in creating and aligning policy windows. We analyse the work in London of Professor Sir Ara Darzi as a policy entrepreneur. An important aspect of Darzi's approach was to align a number of important institutional networks to conjoin related problems. Our findings highlight how a policy entrepreneur not only opens policy windows but also yokes together a network to make policy agendas happen. Our contribution reveals the role of clinical leadership in health reform.
Abstract:
There is growing interest in Discovery Services for locating RFID and supply chain data between companies globally, to obtain product lifecycle information for individual objects. Discovery Services are heralded as a means to find serial-level data from previously unknown parties; more realistically, however, they provide a means to reduce the communications load on the information services, the network and the requesting client application. Attempts to design a standardised Discovery Service will not succeed unless security is considered in every aspect of the design. In this paper we clearly show that security cannot be bolted on in the form of access control, although this is also required. The basic communication model of the Discovery Service critically affects who shares what data with whom, and what level of trust is required between the interacting parties. © 2009 IEEE.
Abstract:
This paper develops a technique for improving the region of attraction of a robust variable horizon model predictive controller. It considers a constrained discrete-time linear system acted upon by a bounded, but unknown time-varying state disturbance. Using constraint tightening for robustness, it is shown how the tightening policy, parameterised as direct feedback on the disturbance, can be optimised to increase the volume of an inner approximation to the controller's true region of attraction. Numerical examples demonstrate the benefits of the policy in increasing region of attraction volume and decreasing the maximum prediction horizon length. © 2012 IEEE.
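As a rough illustration of constraint tightening, the sketch below computes tightened state and input bounds for a scalar system x[k+1] = a*x[k] + b*u[k] + w[k] with |w[k]| <= w_max. It assumes a fixed linear feedback gain as the candidate policy, which is a simpler special case than the optimised disturbance-feedback parameterisation the abstract describes. The function name, the parameter names and the scalar setting are all hypothetical and are not taken from the paper.

```python
def tightened_bounds(a, b, gain, w_max, x_max, u_max, horizon):
    """Return [(state_bound, input_bound), ...] for prediction steps 0..horizon.

    Assumption: the candidate policy is u = gain * e on the prediction error e,
    so the error evolves as e[j+1] = (a + b*gain) * e[j] + w[j].
    """
    phi = a + b * gain   # closed-loop error dynamics
    margin = 0.0         # worst-case |e[j]| accumulated so far
    power = 1.0          # phi ** j
    bounds = []
    for _ in range(horizon + 1):
        # Shrink each constraint by the worst-case effect of past disturbances.
        bounds.append((x_max - margin, u_max - abs(gain) * margin))
        margin += abs(power) * w_max
        power *= phi
    return bounds


# A deadbeat gain (a + b*gain == 0) stops the margin growing after one step,
# while no feedback (gain == 0, a == 1) lets it grow linearly with the horizon.
deadbeat = tightened_bounds(1.0, 1.0, -1.0, 0.1, 1.0, 1.0, horizon=2)
open_loop = tightened_bounds(1.0, 1.0, 0.0, 0.1, 1.0, 1.0, horizon=2)
```

Comparing the two results shows why the choice of tightening policy matters: a better policy leaves larger tightened constraint sets at later prediction steps, which in turn admits more initial states.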
Abstract:
The partially observable Markov decision process (POMDP) has been proposed as a dialogue model that enables automatic improvement of the dialogue policy and robustness to speech understanding errors. It requires, however, a large number of dialogues to train the dialogue policy. Gaussian processes (GP) have recently been applied to POMDP dialogue management optimisation showing an ability to substantially increase the speed of learning. Here, we investigate this further using the Bayesian Update of Dialogue State dialogue manager. We show that it is possible to apply Gaussian processes directly to the belief state, removing the need for a parametric policy representation. In addition, the resulting policy learns significantly faster while maintaining operational performance. © 2012 IEEE.
Abstract:
This paper describes a novel approach to the analysis of supply and demand of water in California. A stochastic model is developed to assess the future supply of and demand for water resources in California. The results are presented in the form of a Sankey diagram where present and stochastically varying future fluxes of water in California and its sub-regions are traced from source to services by mapping the various transformations of water from when it is first made available for use, through its treatment, recycling and reuse, to its eventual loss in a variety of sinks. This helps to highlight the connections of water with energy and land resources, including the amount of energy used to pump and treat water, the amount of water used for energy production, and the land resources that create a water demand to produce crops for food. By mapping water in this way, policy-makers can more easily understand the competing uses of water, through the identification of the services it delivers (e.g. sanitation, food production, landscaping), the potential opportunities for improving the management of the resource and the connections with other resources which are often overlooked in a traditional sector-based management strategy. This paper focuses on a Sankey diagram for water, but the ultimate aim is the visualisation of linked resource futures through inter-connected Sankey diagrams for energy, land and water, tracking changes from the basic resources for all three, their transformations, and the final services they provide.
Abstract:
Although it is widely believed that reinforcement learning is a suitable tool for describing behavioral learning, the mechanisms by which it can be implemented in networks of spiking neurons are not fully understood. Here, we show that different learning rules emerge from a policy gradient approach depending on which features of the spike trains are assumed to influence the reward signals, i.e., depending on which neural code is in effect. We use the framework of Williams (1992) to derive learning rules for arbitrary neural codes. For illustration, we present policy-gradient rules for three different example codes - a spike count code, a spike timing code and the most general "full spike train" code - and test them on simple model problems. In addition to classical synaptic learning, we derive learning rules for intrinsic parameters that control the excitability of the neuron. The spike count learning rule has structural similarities with established Bienenstock-Cooper-Munro rules. If the distribution of the relevant spike train features belongs to the natural exponential family, the learning rules have a characteristic shape that raises interesting prediction problems.
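To make the idea of a policy-gradient rule for a spike count code concrete, here is a minimal sketch of a Williams-style (REINFORCE) update. It is not the derivation from the paper. It assumes a single neuron whose spike count is Poisson with rate exp(w), a member of the natural exponential family, and a hypothetical toy reward that penalises the squared distance of the count from a target. All names and constants are illustrative.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson-distributed spike count (Knuth's multiplication method)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def train_spike_count_neuron(target=5.0, steps=20000, eta=0.0005, seed=0):
    """Return the mean firing rate over the last quarter of training."""
    rng = random.Random(seed)
    w = 0.0         # log-rate parameter: expected spike count is exp(w)
    baseline = 0.0  # running average of the reward, used to reduce variance
    tail_start = steps - steps // 4
    tail_sum = 0.0
    for step in range(steps):
        rate = math.exp(w)
        n = sample_poisson(rate, rng)
        reward = -(n - target) ** 2  # toy reward, an assumption of this sketch
        # Score function: d/dw log Poisson(n | exp(w)) = n - exp(w)
        w += eta * (reward - baseline) * (n - rate)
        baseline += 0.05 * (reward - baseline)
        if step >= tail_start:
            tail_sum += rate
    return tail_sum / (steps - tail_start)


learned_rate = train_spike_count_neuron()
```

For this toy reward the expected value is -(rate + (rate - target)^2), so the learned rate should settle near target - 0.5 rather than at the target itself, because the Poisson variance grows with the rate. The update has the form (reward - baseline) * (observed count - expected count), which is the characteristic shape of a policy-gradient rule for an exponential-family code.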