932 resultados para Markov chains. Convergence. Evolutionary Strategy. Large Deviations
Resumo:
We derive necessary and sufficient conditions for the existence of bounded or summable solutions to systems of linear equations associated with Markov chains. This substantially extends a famous result of G. E. H. Reuter, which provides a convenient means of checking various uniqueness criteria for birth-death processes. Our result allows chains with much more general transition structures to be accommodated. One application is to give a new proof of an important result of M. F. Chen concerning upwardly skip-free processes. We then use our generalization of Reuter's lemma to prove new results for downwardly skip-free chains, such as the Markov branching process and several of its many generalizations. This permits us to establish uniqueness criteria for several models, including the general birth, death, and catastrophe process, extended branching processes, and asymptotic birth-death processes, the latter being neither upwardly skip-free nor downwardly skip-free.
Resumo:
These lecture notes are devoted to present several uses of Large Deviation asymptotics in Branching Processes.
Resumo:
2000 Mathematics Subject Classification: 60J45, 60K25
Resumo:
We investigate a class of simple models for Langevin dynamics of turbulent flows, including the one-layer quasi-geostrophic equation and the two-dimensional Euler equations. Starting from a path integral representation of the transition probability, we compute the most probable fluctuation paths from one attractor to any state within its basin of attraction. We prove that such fluctuation paths are the time reversed trajectories of the relaxation paths for a corresponding dual dynamics, which are also within the framework of quasi-geostrophic Langevin dynamics. Cases with or without detailed balance are studied. We discuss a specific example for which the stationary measure displays either a second order (continuous) or a first order (discontinuous) phase transition and a tricritical point. In situations where a first order phase transition is observed, the dynamics are bistable. Then, the transition paths between two coexisting attractors are instantons (fluctuation paths from an attractor to a saddle), which are related to the relaxation paths of the corresponding dual dynamics. For this example, we show how one can analytically determine the instantons and compute the transition probabilities for rare transitions between two attractors.
Resumo:
A RET network consists of a network of photo-active molecules called chromophores that can participate in inter-molecular energy transfer called resonance energy transfer (RET). RET networks are used in a variety of applications including cryptographic devices, storage systems, light harvesting complexes, biological sensors, and molecular rulers. In this dissertation, we focus on creating a RET device called closed-diffusive exciton valve (C-DEV) in which the input to output transfer function is controlled by an external energy source, similar to a semiconductor transistor like the MOSFET. Due to their biocompatibility, molecular devices like the C-DEVs can be used to introduce computing power in biological, organic, and aqueous environments such as living cells. Furthermore, the underlying physics in RET devices are stochastic in nature, making them suitable for stochastic computing in which true random distribution generation is critical.
In order to determine a valid configuration of chromophores for the C-DEV, we developed a systematic process based on user-guided design space pruning techniques and built-in simulation tools. We show that our C-DEV is 15x better than C-DEVs designed using ad hoc methods that rely on limited data from prior experiments. We also show ways in which the C-DEV can be improved further and how different varieties of C-DEVs can be combined to form more complex logic circuits. Moreover, the systematic design process can be used to search for valid chromophore network configurations for a variety of RET applications.
We also describe a feasibility study for a technique used to control the orientation of chromophores attached to DNA. Being able to control the orientation can expand the design space for RET networks because it provides another parameter to tune their collective behavior. While results showed limited control over orientation, the analysis required the development of a mathematical model that can be used to determine the distribution of dipoles in a given sample of chromophore constructs. The model can be used to evaluate the feasibility of other potential orientation control techniques.
Resumo:
Markov Chain analysis was recently proposed to assess the time scales and preferential pathways into biological or physical networks by computing residence time, first passage time, rates of transfer between nodes and number of passages in a node. We propose to adapt an algorithm already published for simple systems to physical systems described with a high resolution hydrodynamic model. The method is applied to bays and estuaries on the Eastern Coast of Canada for their interest in shellfish aquaculture. Current velocities have been computed by using a 2 dimensional grid of elements and circulation patterns were summarized by averaging Eulerian flows between adjacent elements. Flows and volumes allow computing probabilities of transition between elements and to assess the average time needed by virtual particles to move from one element to another, the rate of transfer between two elements, and the average residence time of each system. We also combined transfer rates and times to assess the main pathways of virtual particles released in farmed areas and the potential influence of farmed areas on other areas. We suggest that Markov chain is complementary to other sets of ecological indicators proposed to analyse the interactions between farmed areas - e.g. depletion index, carrying capacity assessment. Markov Chain has several advantages with respect to the estimation of connectivity between pair of sites. It makes possible to estimate transfer rates and times at once in a very quick and efficient way, without the need to perform long term simulations of particle or tracer concentration.
Resumo:
Markov chain Monte Carlo (MCMC) is a methodology that is gaining widespread use in the phylogenetics community and is central to phylogenetic software packages such as MrBayes. An important issue for users of MCMC methods is how to select appropriate values for adjustable parameters such as the length of the Markov chain or chains, the sampling density, the proposal mechanism, and, if Metropolis-coupled MCMC is being used, the number of heated chains and their temperatures. Although some parameter settings have been examined in detail in the literature, others are frequently chosen with more regard to computational time or personal experience with other data sets. Such choices may lead to inadequate sampling of tree space or an inefficient use of computational resources. We performed a detailed study of convergence and mixing for 70 randomly selected, putatively orthologous protein sets with different sizes and taxonomic compositions. Replicated runs from multiple random starting points permit a more rigorous assessment of convergence, and we developed two novel statistics, delta and epsilon, for this purpose. Although likelihood values invariably stabilized quickly, adequate sampling of the posterior distribution of tree topologies took considerably longer. Our results suggest that multimodality is common for data sets with 30 or more taxa and that this results in slow convergence and mixing. However, we also found that the pragmatic approach of combining data from several short, replicated runs into a metachain to estimate bipartition posterior probabilities provided good approximations, and that such estimates were no worse in approximating a reference posterior distribution than those obtained using a single long run of the same length as the metachain. Precision appears to be best when heated Markov chains have low temperatures, whereas chains with high temperatures appear to sample trees with high posterior probabilities only rarely. [Bayesian phylogenetic inference; heating parameter; Markov chain Monte Carlo; replicated chains.]
Resumo:
Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size, despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.
Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.
One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.
Latent class models for the joint distribution of multivariate categorical, such as the PARAFAC decomposition, data play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.
In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.
The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo. The Markov Chain Monte Carlo method is the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov Chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.
Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
Resumo:
INTRODUCTION: Hip fractures are responsible for excessive mortality, decreasing the 5-year survival rate by about 20%. From an economic perspective, they represent a major source of expense, with direct costs in hospitalization, rehabilitation, and institutionalization. The incidence rate sharply increases after the age of 70, but it can be reduced in women aged 70-80 years by therapeutic interventions. Recent analyses suggest that the most efficient strategy is to implement such interventions in women at the age of 70 years. As several guidelines recommend bone mineral density (BMD) screening of postmenopausal women with clinical risk factors, our objective was to assess the cost-effectiveness of two screening strategies applied to elderly women aged 70 years and older. METHODS: A cost-effectiveness analysis was performed using decision-tree analysis and a Markov model. Two alternative strategies, one measuring BMD of all women, and one measuring BMD only of those having at least one risk factor, were compared with the reference strategy "no screening". Cost-effectiveness ratios were measured as cost per year gained without hip fracture. Most probabilities were based on data observed in EPIDOS, SEMOF and OFELY cohorts. RESULTS: In this model, which is mostly based on observed data, the strategy "screen all" was more cost effective than "screen women at risk." For one woman screened at the age of 70 and followed for 10 years, the incremental (additional) cost-effectiveness ratio of these two strategies compared with the reference was 4,235 euros and 8,290 euros, respectively. CONCLUSION: The results of this model, under the assumptions described in the paper, suggest that in women aged 70-80 years, screening all women with dual-energy X-ray absorptiometry (DXA) would be more effective than no screening or screening only women with at least one risk factor. Cost-effectiveness studies based on decision-analysis trees maybe useful tools for helping decision makers, and further models based on different assumptions should be performed to improve the level of evidence on cost-effectiveness ratios of the usual screening strategies for osteoporosis.
Resumo:
Reliability and dependability modeling can be employed during many stages of analysis of a computing system to gain insights into its critical behaviors. To provide useful results, realistic models of systems are often necessarily large and complex. Numerical analysis of these models presents a formidable challenge because the sizes of their state-space descriptions grow exponentially in proportion to the sizes of the models. On the other hand, simulation of the models requires analysis of many trajectories in order to compute statistically correct solutions. This dissertation presents a novel framework for performing both numerical analysis and simulation. The new numerical approach computes bounds on the solutions of transient measures in large continuous-time Markov chains (CTMCs). It extends existing path-based and uniformization-based methods by identifying sets of paths that are equivalent with respect to a reward measure and related to one another via a simple structural relationship. This relationship makes it possible for the approach to explore multiple paths at the same time,· thus significantly increasing the number of paths that can be explored in a given amount of time. Furthermore, the use of a structured representation for the state space and the direct computation of the desired reward measure (without ever storing the solution vector) allow it to analyze very large models using a very small amount of storage. Often, path-based techniques must compute many paths to obtain tight bounds. In addition to presenting the basic path-based approach, we also present algorithms for computing more paths and tighter bounds quickly. One resulting approach is based on the concept of path composition whereby precomputed subpaths are composed to compute the whole paths efficiently. Another approach is based on selecting important paths (among a set of many paths) for evaluation. Many path-based techniques suffer from having to evaluate many (unimportant) paths. Evaluating the important ones helps to compute tight bounds efficiently and quickly.
Resumo:
This paper deals with the expected discounted continuous control of piecewise deterministic Markov processes (PDMP`s) using a singular perturbation approach for dealing with rapidly oscillating parameters. The state space of the PDMP is written as the product of a finite set and a subset of the Euclidean space a""e (n) . The discrete part of the state, called the regime, characterizes the mode of operation of the physical system under consideration, and is supposed to have a fast (associated to a small parameter epsilon > 0) and a slow behavior. By using a similar approach as developed in Yin and Zhang (Continuous-Time Markov Chains and Applications: A Singular Perturbation Approach, Applications of Mathematics, vol. 37, Springer, New York, 1998, Chaps. 1 and 3) the idea in this paper is to reduce the number of regimes by considering an averaged model in which the regimes within the same class are aggregated through the quasi-stationary distribution so that the different states in this class are replaced by a single one. The main goal is to show that the value function of the control problem for the system driven by the perturbed Markov chain converges to the value function of this limit control problem as epsilon goes to zero. This convergence is obtained by, roughly speaking, showing that the infimum and supremum limits of the value functions satisfy two optimality inequalities as epsilon goes to zero. This enables us to show the result by invoking a uniqueness argument, without needing any kind of Lipschitz continuity condition.
Resumo:
We prove that, once an algorithm of perfect simulation for a stationary and ergodic random field F taking values in S(Zd), S a bounded subset of R(n), is provided, the speed of convergence in the mean ergodic theorem occurs exponentially fast for F. Applications from (non-equilibrium) statistical mechanics and interacting particle systems are presented.
Resumo:
Among the largest resources for biological sequence data is the large amount of expressed sequence tags (ESTs) available in public and proprietary databases. ESTs provide information on transcripts but for technical reasons they often contain sequencing errors. Therefore, when analyzing EST sequences computationally, such errors must be taken into account. Earlier attempts to model error prone coding regions have shown good performance in detecting and predicting these while correcting sequencing errors using codon usage frequencies. In the research presented here, we improve the detection of translation start and stop sites by integrating a more complex mRNA model with codon usage bias based error correction into one hidden Markov model (HMM), thus generalizing this error correction approach to more complex HMMs. We show that our method maintains the performance in detecting coding sequences.
Resumo:
In plants, an oligogene family encodes NADP-malic enzymes (NADP-me), which are responsible for various functions and exhibit different kinetics and expression patterns. In particular, a chloroplast isoform of NADP-me plays a key role in one of the three biochemical subtypes of C4 photosynthesis, an adaptation to warm environments that evolved several times independently during angiosperm diversification. By combining genomic and phylogenetic approaches, this study aimed at identifying the molecular mechanisms linked to the recurrent evolutions of C4-specific NADP-me in grasses (Poaceae). Genes encoding NADP-me (nadpme) were retrieved from genomes of model grasses and isolated from a large sample of C3 and C4 grasses. Genomic and phylogenetic analyses showed that 1) the grass nadpme gene family is composed of four main lineages, one of which is expressed in plastids (nadpme-IV), 2) C4-specific NADP-me evolved at least five times independently from nadpme-IV, and 3) some codons driven by positive selection underwent parallel changes during the multiple C4 origins. The C4 NADP-me being expressed in chloroplasts probably constrained its recurrent evolutions from the only plastid nadpme lineage and this common starting point limited the number of evolutionary paths toward a C4 optimized enzyme, resulting in genetic convergence. In light of the history of nadpme genes, an evolutionary scenario of the C4 phenotype using NADP-me is discussed.