940 resultados para VARIABLE LENGTH MARKOV CHAINS
Resumo:
Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0,1) (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation and control spaces, multiple-agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward. ©2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
Resumo:
This paper presents a novel framework to further advance the recent trend of using query decomposition and high-order term relationships in query language modeling, which takes into account terms implicitly associated with different subsets of query terms. Existing approaches, most remarkably the language model based on the Information Flow method are however unable to capture multiple levels of associations and also suffer from a high computational overhead. In this paper, we propose to compute association rules from pseudo feedback documents that are segmented into variable length chunks via multiple sliding windows of different sizes. Extensive experiments have been conducted on various TREC collections and our approach significantly outperforms a baseline Query Likelihood language model, the Relevance Model and the Information Flow model.
Resumo:
Nanoparticle manipulation by various plasma forces in near-substrate areas of the Integrated Plasma-Aided Nanofabrication Facility (IPANF) is investigated. In the IPANF, high-density plasmas of low-temperature rf glow discharges are sustained. The model near-substrate area includes a variable-length pre-sheath, where a negatively charged nanoparticle is accelerated, and a self-consistent collisionless sheath with a repulsive electrostatic potential. Conditions enabling the nanoparticle to overcome the repulsive barrier and deposit onto the substrate are investigated numerically and experimentally. Under certain conditions the momentum gained by the nanoparticle in the pre-sheath area appears to be sufficient for the driving ion drag force to outbalance the repulsive electrostatic and thermophoretic forces. Numerical results are applied for the explanation of size-selective nanoparticle deposition in the Ar+H2+CH4 plasma-assisted chemical vapor deposition of various carbon nanostructure patterns for electron field emitters and are cross-referenced by the field emission scanning electron microscopy. It is shown that the nanoparticles can be efficiently manipulated by the temperature gradient-controlled thermophoretic force. Experimentally, the temperature gradients in the near-substrate areas are measured in situ by means of the temperature gradient probe and related to the nanofilm fabrication conditions. The results are relevant to plasma-assisted synthesis of numerous nanofilms employing structural incorporation of the plasma-grown nanoparticles, including but not limited to nanofabrication of ordered single-crystalline carbon nanotip arrays for electron field emission applications.
Resumo:
Manipulation of a single nanoparticle in the near-substrate areas of high-density plasmas of low-temperature glow discharges is studied. It is shown that the nanoparticles can be efficiently manipulated by the thermophoretic force controlled by external heating of the substrate stage. Particle deposition onto or repulsion from nanostructured carbon surfaces critically depends on the values of the neutral gas temperature gradient in the near-substrate areas, which is directly measured in situ in different heating regimes by originally developed temperature gradient probe. The measured values of the near-surface temperature gradient are used in the numerical model of nanoparticle dynamics in a variable-length presheath. Specific conditions enabling the nanoparticle to overcome the repulsive potential and deposit on the substrate during the discharge operation are investigated. The results are relevant to fabrication of various nanostructured films employing structural incorporation of the plasma-grown nanoparticles, in particular, to nanoparticle deposition in the plasma-enhanced chemical-vapor deposition of carbon nanostructures in hydrocarbon-based plasmas.
Resumo:
In this paper new online adaptive hidden Markov model (HMM) state estimation schemes are developed, based on extended least squares (ELS) concepts and recursive prediction error (RPE) methods. The best of the new schemes exploit the idempotent nature of Markov chains and work with a least squares prediction error index, using a posterior estimates, more suited to Markov models then traditionally used in identification of linear systems.
Resumo:
This article discusses the design and development of GRDB (General Purpose Relational Data Base System) which has been implemented on a DEC-1090 system in Pascal. GRDB is a general purpose database system designed to be completely independent of the nature of data to be handled, since it is not tailored to the specific requirements of any particular enterprise. It can handle different types of data such as variable length records and textual data. Apart from the usual database facilities such as data definition and data manipulation, GRDB supports User Definition Language (UDL) and Security definition language. These facilities are provided through a SEQUEL-like General Purpose Query Language (GQL). GRDB provides adequate protection facilities up to the relation level. The concept of “security matrix” has been made use of to provide database protection. The concept of Unique IDentification number (UID) and Password is made use of to ensure user identification and authentication. The concept of static integrity constraints has been used to ensure data integrity. Considerable efforts have been made to improve the response time through indexing on the data files and query optimisation. GRDB is designed for an interactive use but alternate provision has been made for its use through batch mode also. A typical Air Force application (consisting of data about personnel, inventory control, and maintenance planning) has been used to test GRDB and it has been found to perform satisfactorily.
Resumo:
Dynamic Bayesian Networks (DBNs) provide a versatile platform for predicting and analysing the behaviour of complex systems. As such, they are well suited to the prediction of complex ecosystem population trajectories under anthropogenic disturbances such as the dredging of marine seagrass ecosystems. However, DBNs assume a homogeneous Markov chain whereas a key characteristics of complex ecosystems is the presence of feedback loops, path dependencies and regime changes whereby the behaviour of the system can vary based on past states. This paper develops a method based on the small world structure of complex systems networks to modularise a non-homogeneous DBN and enable the computation of posterior marginal probabilities given evidence in forwards inference. It also provides an approach for an approximate solution for backwards inference as convergence is not guaranteed for a path dependent system. When applied to the seagrass dredging problem, the incorporation of path dependency can implement conditional absorption and allows release from the zero state in line with environmental and ecological observations. As dredging has a marked global impact on seagrass and other marine ecosystems of high environmental and economic value, using such a complex systems model to develop practical ways to meet the needs of conservation and industry through enhancing resistance and/or recovery is of paramount importance.
Resumo:
The increased accuracy in the cosmological observations, especially in the measurements of the comic microwave background, allow us to study the primordial perturbations in grater detail. In this thesis, we allow the possibility for a correlated isocurvature perturbations alongside the usual adiabatic perturbations. Thus far the simplest six parameter \Lambda CDM model has been able to accommodate all the observational data rather well. However, we find that the 3-year WMAP data and the 2006 Boomerang data favour a nonzero nonadiabatic contribution to the CMB angular power sprctrum. This is primordial isocurvature perturbation that is positively correlated with the primordial curvature perturbation. Compared with the adiabatic \Lambda CMD model we have four additional parameters describing the increased complexity if the primordial perturbations. Our best-fit model has a 4% nonadiabatic contribution to the CMB temperature variance and the fit is improved by \Delta\chi^2 = 9.7. We can attribute this preference for isocurvature to a feature in the peak structure of the angular power spectrum, namely, the widths of the second and third acoustic peak. Along the way, we have improved our analysis methods by identifying some issues with the parametrisation of the primordial perturbation spectra and suggesting ways to handle these. Due to the improvements, the convergence of our Markov chains is improved. The change of parametrisation has an effect on the MCMC analysis because of the change in priors. We have checked our results against this and find only marginal differences between our parametrisation.
Resumo:
H.264 video standard achieves high quality video along with high data compression when compared to other existing video standards. H.264 uses context-based adaptive variable length coding (CAVLC) to code residual data in Baseline profile. In this paper we describe a novel architecture for CAVLC decoder including coeff-token decoder, level decoder total-zeros decoder and run-before decoder UMC library in 0.13 mu CMOS technology is used to synthesize the proposed design. The proposed design reduces chip area and improves critical path performance of CAVLC decoder in comparison with [1]. Macroblock level (including luma and chroma) pipeline processing for CAVLC is implemented with an average of 141 cycles (including pipeline buffering) per macroblock at 250MHz clock frequency. To compare our results with [1] clock frequency is constrained to 125MHz. The area required for the proposed architecture is 17586 gates, which is 22.1% improvement in comparison to [1]. We obtain a throughput of 1.73 * 10(6) macroblocks/second, which is 28% higher than that reported in [1]. The proposed design meets the processing requirement of 1080HD [5] video at 30frames/seconds.
Resumo:
Several covalently linked bisporphyrin systems, free-base (H2P---H2P), hybrid bisporphyrins (Zn---H2P) and Zn(II) dimers (ZnP---ZnP) and their 1:1 molecular complexes with sym 1,3,5-trinitrobenzene have been investigated by optical absorption and emission, and magnetic resonance spectroscopic methods. In these systems, two porphyrin units are linked singly through one of the meso aryl groups via ether linkages of variable length. The bisporphyrins cooperatively bind a molecule of a ?-acceptor; 1,3,5-trinitrobenzene (TNB). The binding constant values vary with interchromophore separation. Maximum binding is observed in the bisporphyrin bearing a two-ether covalent linkage. It is found that TNB quenches the fluorescence of the two porphyrine units in a selective manner. It is suggested that a critical distance between the two porphyrin units is necessary for the observance of maximum cooperative intermolecular binding with an acceptor.
Resumo:
Mathematical modelling plays a vital role in the design, planning and operation of flexible manufacturing systems (FMSs). In this paper, attention is focused on stochastic modelling of FMSs using Markov chains, queueing networks, and stochastic Petri nets. We bring out the role of these modelling tools in FMS performance evaluation through several illustrative examples and provide a critical comparative evaluation. We also include a discussion on the modelling of deadlocks which constitute an important source of performance degradation in fully automated FMSs.
Resumo:
We present a new approach to spoken language modeling for language identification (LID) using the Lempel-Ziv-Welch (LZW) algorithm. The LZW technique is applicable to any kind of tokenization of the speech signal. Because of the efficiency of LZW algorithm to obtain variable length symbol strings in the training data, the LZW codebook captures the essentials of a language effectively. We develop two new deterministic measures for LID based on the LZW algorithm namely: (i) Compression ratio score (LZW-CR) and (ii) weighted discriminant score (LZW-WDS). To assess these measures, we consider error-free tokenization of speech as well as artificially induced noise in the tokenization. It is shown that for a 6 language LID task of OGI-TS database with clean tokenization, the new model (LZW-WDS) performs slightly better than the conventional bigram model. For noisy tokenization, which is the more realistic case, LZW-WDS significantly outperforms the bigram technique
Resumo:
Image segmentation is formulated as a stochastic process whose invariant distribution is concentrated at points of the desired region. By choosing multiple seed points, different regions can be segmented. The algorithm is based on the theory of time-homogeneous Markov chains and has been largely motivated by the technique of simulated annealing. The method proposed here has been found to perform well on real-world clean as well as noisy images while being computationally far less expensive than stochastic optimisation techniques
Resumo:
We study a State Dependent Attempt Rate (SDAR) approximation to model M queues (one queue per node) served by the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) protocol as standardized in the IEEE 802.11 Distributed Coordination Function (DCF). The approximation is that, when n of the M queues are non-empty, the (transmission) attempt probability of each of the n non-empty nodes is given by the long-term (transmission) attempt probability of n saturated nodes. With the arrival of packets into the M queues according to independent Poisson processes, the SDAR approximation reduces a single cell with non-saturated nodes to a Markovian coupled queueing system. We provide a sufficient condition under which the joint queue length Markov chain is positive recurrent. For the symmetric case of equal arrival rates and finite and equal buffers, we develop an iterative method which leads to accurate predictions for important performance measures such as collision probability, throughput and mean packet delay. We replace the MAC layer with the SDAR model of contention by modifying the NS-2 source code pertaining to the MAC layer, keeping all other layers unchanged. By this model-based simulation technique at the MAC layer, we achieve speed-ups (w.r.t. MAC layer operations) up to 5.4. Through extensive model-based simulations and numerical results, we show that the SDAR model is an accurate model for the DCF MAC protocol in single cells. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
We consider the asymptotics of the invariant measure for the process of spatial distribution of N coupled Markov chains in the limit of a large number of chains. Each chain reflects the stochastic evolution of one particle. The chains are coupled through the dependence of transition rates on the spatial distribution of particles in the various states. Our model is a caricature for medium access interactions in wireless local area networks. Our model is also applicable in the study of spread of epidemics in a network. The limiting process satisfies a deterministic ordinary differential equation called the McKean-Vlasov equation. When this differential equation has a unique globally asymptotically stable equilibrium, the spatial distribution converges weakly to this equilibrium. Using a control-theoretic approach, we examine the question of a large deviation from this equilibrium.