19 results for Markov chains. Convergence. Evolutionary Strategy. Large Deviations
Abstract:
This paper explores platform strategies along the business ecosystem lifecycle (BELC), based on a multiple-case study. Extending observations on platform strategies from the firm level to the business ecosystem level, the study investigates platform strategy through three views: technology, application and organisation. As a result, a general evolutionary pattern of platform strategy along the BELC is identified, where an open strategy emerges at the birth and expansion phases, a dominating strategy rises at the authority phase, and finally an opportunistic strategy takes over at the renewal phase. This paper connects the core firms in the business ecosystem with the evolutionary platform strategies. Copyright © 2013 Inderscience Enterprises Ltd.
Abstract:
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter expansion step, which is shown to be essential for good convergence properties. As an illustration, the method is applied to learning a human controller.
Abstract:
A partially observable Markov decision process has been proposed as a dialogue model that enables robustness to speech recognition errors and automatic policy optimisation using reinforcement learning (RL). However, conventional RL algorithms require a very large number of dialogues, necessitating a user simulator. Recently, Gaussian processes have been shown to substantially speed up the optimisation, making it possible to learn directly from interaction with human users. However, early studies have been limited to very low-dimensional spaces and the learning has exhibited convergence problems. Here we investigate learning from human interaction using the Bayesian Update of Dialogue State system. This dynamic Bayesian network-based system has an optimisation space covering more than one hundred features, allowing a wide range of behaviours to be learned. Using an improved policy model and a more robust reward function, we show that stable learning can be achieved that significantly outperforms a simulator-trained policy. © 2013 IEEE.
Abstract:
In this paper, we present a study of the electrical and optical characteristics of n-type tin-oxide nanowires integrated using a top-down scale-up strategy. Through a combination of contact printing and plasma-based back-channel passivation, we have achieved stable electrical characteristics, with standard deviations in mobility and threshold voltage of 9.1% and 25%, respectively, over a large area of 1 × 1 cm². Contact printing yielded highly aligned nanowires, thus minimizing the number of nanowire-nanowire junctions, which serve to limit carrier transport in the channel. In addition, persistent photoconductivity was observed, which we attribute to oxygen vacancy ionization, and it was subsequently eliminated using a gate pulse driving scheme. © 2014 IEEE.