11 results for STOCHASTIC PROCESSES

at Duke University


Relevance:

70.00%

Publisher:

Abstract:

Continuing our development of a mathematical theory of stochastic microlensing, we study the random shear and expected number of random lensed images of different types. In particular, we characterize the first three leading terms in the asymptotic expression of the joint probability density function (pdf) of the random shear tensor due to point masses in the limit of an infinite number of stars. Up to this order, the pdf depends on the magnitude of the shear tensor, the optical depth, and the mean number of stars through a combination of radial position and the star's mass. As a consequence, the pdf's of the shear components are seen to converge, in the limit of an infinite number of stars, to shifted Cauchy distributions, which shows that the shear components have heavy tails in that limit. The asymptotic pdf of the shear magnitude in the limit of an infinite number of stars is also presented. All the results on the random microlensing shear are given for a general point in the lens plane. Extending to the general random distributions (not necessarily uniform) of the lenses, we employ the Kac-Rice formula and Morse theory to deduce general formulas for the expected total number of images and the expected number of saddle images. We further generalize these results by considering random sources defined on a countable compact covering of the light source plane. This is done to introduce the notion of global expected number of positive parity images due to a general lensing map. Applying the result to microlensing, we calculate the asymptotic global expected number of minimum images in the limit of an infinite number of stars, where the stars are uniformly distributed. This global expectation is bounded, while the global expected number of images and the global expected number of saddle images diverge as the order of the number of stars. © 2009 American Institute of Physics.

Relevance:

70.00%

Publisher:

Abstract:

The paper investigates stochastic processes forced by independent and identically distributed jumps occurring according to a Poisson process. The impact of different distributions of the jump amplitudes is analyzed for processes with linear drift. Exact expressions of the probability density functions are derived when jump amplitudes are distributed as exponential, gamma, and mixture of exponential distributions for both natural and reflecting boundary conditions. The mean level-crossing properties are studied in relation to the different jump amplitudes. As an example of application of the previous theoretical derivations, the role of different rainfall-depth distributions on an existing stochastic soil water balance model is analyzed. It is shown how the shape of the distribution of daily rainfall depths plays a more relevant role in the soil moisture probability distribution as the rainfall frequency decreases, as predicted by future climatic scenarios. © 2010 The American Physical Society.
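
As a rough illustration of the kind of process described in this abstract, the sketch below simulates a level that decays linearly between jumps and receives exponentially distributed jumps at Poisson arrival times. The drift form `-decay * x`, the function name, and all parameter values are assumptions made for illustration and are not taken from the paper. For this particular setup, the long-run mean level is `rate * mean_jump / decay`, which the simulation can be checked against.

```python
import math
import random


def simulate_mean_level(rate, decay, mean_jump, n_jumps, seed=0):
    """Time-averaged level of a jump process with linear decay.

    Assumed dynamics (illustrative only): dx/dt = -decay * x between jumps,
    jumps arrive as a Poisson process with intensity `rate`, and jump sizes
    are exponential with mean `mean_jump`.
    """
    rng = random.Random(seed)
    x = 0.0
    integral = 0.0    # running integral of x(t) dt
    total_time = 0.0
    for _ in range(n_jumps):
        dt = rng.expovariate(rate)            # waiting time to the next jump
        shrink = math.exp(-decay * dt)
        # Exact integral of the exponentially decaying level over [0, dt].
        integral += x * (1.0 - shrink) / decay
        x *= shrink                           # level just before the jump
        total_time += dt
        x += rng.expovariate(1.0 / mean_jump) # add the jump amplitude
    return integral / total_time


if __name__ == "__main__":
    estimate = simulate_mean_level(rate=0.5, decay=0.2, mean_jump=1.0, n_jumps=200000)
    print("simulated mean level:", estimate)
    print("theoretical mean level:", 0.5 * 1.0 / 0.2)
```

Swapping the last `expovariate` call for a gamma or mixture sampler would give a simple way to compare jump-amplitude distributions with the same mean.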

Relevance:

70.00%

Publisher:

Abstract:

We develop a model for stochastic processes with random marginal distributions. Our model relies on a stick-breaking construction for the marginal distribution of the process, and introduces dependence across locations by using a latent Gaussian copula model as the mechanism for selecting the atoms. The resulting latent stick-breaking process (LaSBP) induces a random partition of the index space, with points closer in space having a higher probability of being in the same cluster. We develop an efficient and straightforward Markov chain Monte Carlo (MCMC) algorithm for computation and discuss applications in financial econometrics and ecology. This article has supplementary material online.
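
For readers unfamiliar with stick-breaking, the sketch below shows the standard truncated construction of random weights, where each weight is a Beta-distributed fraction of the stick that remains. It is only the generic building block: the Beta(1, alpha) choice, the truncation level, and the function name are assumptions, and the latent Gaussian copula mechanism that the LaSBP uses to select atoms is not reproduced here.

```python
import random


def stick_breaking_weights(alpha, n_atoms, seed=0):
    """Truncated stick-breaking weights with Beta(1, alpha) break fractions.

    Returns n_atoms positive weights whose sum is at most 1; the leftover
    mass is whatever remains of the stick after the last break.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0
    for _ in range(n_atoms):
        fraction = rng.betavariate(1.0, alpha)  # share of the remaining stick
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights


if __name__ == "__main__":
    w = stick_breaking_weights(alpha=2.0, n_atoms=200)
    print("largest weights:", sorted(w, reverse=True)[:5])
    print("total mass captured:", sum(w))
```

Smaller values of `alpha` concentrate the mass on a few atoms, which corresponds to fewer effective clusters.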

Relevance:

70.00%

Publisher:

Abstract:

The transition of the mammalian cell from quiescence to proliferation is a highly variable process. Over the last four decades, two lines of apparently contradictory, phenomenological models have been proposed to account for such temporal variability. These include various forms of the transition probability (TP) model and the growth control (GC) model, which lack mechanistic details. The GC model was further proposed as an alternative explanation for the concept of the restriction point, which we recently demonstrated as being controlled by a bistable Rb-E2F switch. Here, through a combination of modeling and experiments, we show that these different lines of models in essence reflect different aspects of stochastic dynamics in cell cycle entry. In particular, we show that the variable activation of E2F can be described by stochastic activation of the bistable Rb-E2F switch, which in turn may account for the temporal variability in cell cycle entry. Moreover, we show that temporal dynamics of E2F activation can be recast into the frameworks of both the TP model and the GC model via parameter mapping. This mapping suggests that the two lines of phenomenological models can be reconciled through the stochastic dynamics of the Rb-E2F switch. It also suggests a potential utility of the TP or GC models in defining concise, quantitative phenotypes of cell physiology. This may have implications in classifying cell types or states.

Relevance:

70.00%

Publisher:

Abstract:

We present a theory of hypoellipticity and unique ergodicity for semilinear parabolic stochastic PDEs with "polynomial" nonlinearities and additive noise, considered as abstract evolution equations in some Hilbert space. It is shown that if Hörmander's bracket condition holds at every point of this Hilbert space, then a lower bound on the Malliavin covariance operator μt can be obtained. Informally, this bound can be read as "Fix any finite-dimensional projection Π on a subspace of sufficiently regular functions. Then the eigenfunctions of μt with small eigenvalues have only a very small component in the image of Π." We also show how to use a priori bounds on the solutions to the equation to obtain good control on the dependency of the bounds on the Malliavin matrix on the initial condition. These bounds are sufficient in many cases to obtain the asymptotic strong Feller property introduced in [HM06]. One of the main novel technical tools is an almost sure bound from below on the size of "Wiener polynomials," where the coefficients are possibly non-adapted stochastic processes satisfying a Lipschitz condition. By exploiting the polynomial structure of the equations, this result can be used to replace Norris' lemma, which is unavailable in the present context. We conclude by showing that the two-dimensional stochastic Navier-Stokes equations and a large class of reaction-diffusion equations fit the framework of our theory.

Relevance:

70.00%

Publisher:

Abstract:

While molecular and cellular processes are often modeled as stochastic processes, such as Brownian motion, chemical reaction networks and gene regulatory networks, there are few attempts to program a molecular-scale process to physically implement stochastic processes. DNA has been used as a substrate for programming molecular interactions, but its applications are restricted to deterministic functions, and unfavorable properties such as slow processing, thermal annealing, aqueous solvents and difficult readout limit them to proof-of-concept purposes. To date, whether there exists a molecular process that can be programmed to implement stochastic processes for practical applications remains unknown.

In this dissertation, a fully specified Resonance Energy Transfer (RET) network between chromophores is accurately fabricated via DNA self-assembly, and the exciton dynamics in the RET network physically implement a stochastic process, specifically a continuous-time Markov chain (CTMC), which has a direct mapping to the physical geometry of the chromophore network. Excited by a light source, a RET network generates random samples in the temporal domain in the form of fluorescence photons which can be detected by a photon detector. The intrinsic sampling distribution of a RET network is derived as a phase-type distribution configured by its CTMC model. The conclusion is that the exciton dynamics in a RET network implement a general and important class of stochastic processes that can be directly and accurately programmed and used for practical applications of photonics and optoelectronics. Different approaches to using RET networks exist with vast potential applications. As an entropy source that can directly generate samples from virtually arbitrary distributions, RET networks can benefit applications that rely on generating random samples such as 1) fluorescent taggants and 2) stochastic computing.
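
The link between a CTMC and a phase-type distribution can be sketched in a few lines: the time until the chain is absorbed is, by definition, phase-type distributed. In the toy example below, absorption is read as the emission of a detectable photon. The two-state network, its state names, and its rates are hypothetical values chosen for illustration, not parameters from the dissertation.

```python
import random

# Hypothetical toy network (illustrative rates, not measured values).
# Each transient state maps to a list of (next_state, rate) pairs;
# a next_state of None marks absorption, read here as photon emission.
TOY_NETWORK = {
    "donor": [("acceptor", 2.0)],
    "acceptor": [(None, 4.0)],
}


def sample_emission_time(network, start, rng):
    """Draw one absorption time of the CTMC, i.e. one phase-type sample."""
    state, elapsed = start, 0.0
    while state is not None:
        transitions = network[state]
        total_rate = sum(rate for _, rate in transitions)
        elapsed += rng.expovariate(total_rate)   # holding time in this state
        targets = [target for target, _ in transitions]
        rates = [rate for _, rate in transitions]
        state = rng.choices(targets, weights=rates)[0]  # pick the next state
    return elapsed


def mean_emission_time(network, start, n_samples, seed=0):
    rng = random.Random(seed)
    total = sum(sample_emission_time(network, start, rng) for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    # For this serial two-state chain the exact mean is 1/2 + 1/4 = 0.75.
    print(mean_emission_time(TOY_NETWORK, "donor", 50000))
```

Changing the rates or adding branches to the dictionary changes the shape of the sampled distribution, which mirrors the idea of configuring the sampling distribution through the network geometry.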

By using RET networks between chromophores to implement fluorescent taggants with temporally coded signatures, the taggant design is not constrained by resolvable dyes and has a significantly larger coding capacity than spectrally or lifetime coded fluorescent taggants. Meanwhile, the taggant detection process becomes highly efficient, and the Maximum Likelihood Estimation (MLE) based taggant identification guarantees high accuracy even with only a few hundred detected photons.
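
A minimal sketch of maximum-likelihood identification from photon arrival times follows. It makes a strong simplifying assumption: each candidate taggant is modeled as a plain exponential distribution with a known rate, whereas the signatures described above are phase-type distributions. The function names and the candidate rates are hypothetical.

```python
import math
import random


def log_likelihood(times, rate):
    """Log-likelihood of i.i.d. exponential arrival times with the given rate."""
    return len(times) * math.log(rate) - rate * sum(times)


def identify_taggant(times, candidate_rates):
    """Return the index of the candidate with the highest log-likelihood."""
    scores = [log_likelihood(times, rate) for rate in candidate_rates]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulate 300 detected photons from the (hypothetical) taggant with rate 2.0.
    photons = [rng.expovariate(2.0) for _ in range(300)]
    print("identified candidate index:", identify_taggant(photons, [1.0, 2.0, 4.0]))
```

With well-separated candidates, a few hundred samples are usually enough for the log-likelihood gap to become large, which is consistent in spirit with the accuracy claim in the abstract.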

Meanwhile, RET-based sampling units (RSU) can be constructed to accelerate probabilistic algorithms for wide applications in machine learning and data analytics. Because probabilistic algorithms often rely on iteratively sampling from parameterized distributions, they can be inefficient in practice on the deterministic hardware traditional computers use, especially for high-dimensional and complex problems. As an efficient universal sampling unit, the proposed RSU can be integrated into a processor / GPU as specialized functional units or organized as a discrete accelerator to bring substantial speedups and power savings.

Relevance:

60.00%

Publisher:

Abstract:

In a stochastic environment, long-term fitness can be influenced by variation, covariation, and serial correlation in vital rates (survival and fertility). Yet no study of an animal population has parsed the contributions of these three aspects of variability to long-term fitness. We do so using a unique database that includes complete life-history information for wild-living individuals of seven primate species that have been the subjects of long-term (22-45 years) behavioral studies. Overall, the estimated levels of vital rate variation had only minor effects on long-term fitness, and the effects of vital rate covariation and serial correlation were even weaker. To explore why, we compared estimated variances of adult survival in primates with values for other vertebrates in the literature and found that adult survival is significantly less variable in primates than it is in the other vertebrates. Finally, we tested the prediction that adult survival, because it more strongly influences fitness in a constant environment, will be less variable than newborn survival, and we found only mixed support for the prediction. Our results suggest that wild primates may be buffered against detrimental fitness effects of environmental stochasticity by their highly developed cognitive abilities, social networks, and broad, flexible diets.

Relevance:

60.00%

Publisher:

Abstract:

Clearance of anogenital and oropharyngeal HPV infections is attributed primarily to a successful adaptive immune response. To date, little attention has been paid to the potential role of stochastic cell dynamics in the time it takes to clear an HPV infection. In this study, we combine mechanistic mathematical models at the cellular level with epidemiological data at the population level to disentangle the respective roles of immune capacity and cell dynamics in the clearing mechanism. Our results suggest that chance, in the form of the stochastic dynamics of basal stem cells, plays a critical role in the elimination of HPV-infected cell clones. In particular, we find that in immunocompetent adolescents with cervical HPV infections, the immune response may contribute less than 20% to virus clearance; the rest is taken care of by the stochastic proliferation dynamics in the basal layer. In HIV-negative individuals, the contribution of the immune response may be negligible.

Relevance:

60.00%

Publisher:

Abstract:

Genetic oscillators, such as circadian clocks, are constantly perturbed by molecular noise arising from the small number of molecules involved in gene regulation. One of the strongest sources of stochasticity is the binary noise that arises from the binding of a regulatory protein to a promoter in the chromosomal DNA. In this study, we focus on two minimal oscillators based on activator titration and repressor titration to understand the key parameters that are important for oscillations and for overcoming binary noise. We show that the rate of unbinding from the DNA, despite traditionally being considered a fast parameter, needs to be slow to broaden the space of oscillatory solutions. The addition of multiple, independent DNA binding sites further expands the oscillatory parameter space for the repressor-titration oscillator and lengthens the period of both oscillators. This effect is a combination of increased effective delay of the unbinding kinetics due to multiple binding sites and increased promoter ultrasensitivity that is specific for repression. We then use stochastic simulation to show that multiple binding sites increase the coherence of oscillations by mitigating the binary noise. Slow values of DNA unbinding rate are also effective in alleviating molecular noise due to the increased distance from the bifurcation point. Our work demonstrates how the number of DNA binding sites and slow unbinding kinetics, which are often omitted in biophysical models of gene circuits, can have a significant impact on the temporal and stochastic dynamics of genetic oscillators.

Relevance:

60.00%

Publisher:

Abstract:

Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size, despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.

Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.

One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.

Latent class models for the joint distribution of multivariate categorical data, such as the PARAFAC decomposition, play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.

In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis-Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis-Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.

Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.
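
The raw statistic behind this paradigm, the waiting time between successive exceedances of a high threshold, is easy to compute from a time-indexed series. The sketch below shows only that preprocessing step; the function name and the example values are hypothetical, and the max-stable velocity processes and the inferential framework of Chapter 5 are not reproduced.

```python
def exceedance_waiting_times(series, threshold):
    """Waiting times (in index units) between successive threshold exceedances."""
    exceedance_times = [t for t, value in enumerate(series) if value > threshold]
    return [later - earlier
            for earlier, later in zip(exceedance_times, exceedance_times[1:])]


if __name__ == "__main__":
    # Made-up series; values above 4 count as extreme events.
    series = [0, 5, 1, 1, 6, 7, 0, 9]
    print(exceedance_waiting_times(series, threshold=4))  # [3, 1, 2]
```

Short waiting times clustered together indicate temporally dependent extremes, while long, roughly geometric gaps are what independent exceedances would produce.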

The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo (MCMC), the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.

Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.

Relevance:

30.00%

Publisher:

Abstract:

Increasing atmospheric carbon dioxide (CO2) from anthropogenic sources is acidifying marine environments, resulting in potentially dramatic consequences for the physical, chemical and biological functioning of these ecosystems. If current trends continue, mean ocean pH is expected to decrease by ~0.2 units over the next ~50 years. Yet, there is also substantial temporal variability in pH and other carbon system parameters in the ocean, resulting in regions that already experience change that exceeds long-term projected trends in pH. This points to short-term dynamics as an important layer of complexity on top of long-term trends. Thus, in order to predict future climate change impacts, there is a critical need to characterize the natural range and dynamics of the marine carbonate system and the mechanisms responsible for observed variability. Here, we present pH and dissolved inorganic carbon (DIC) at time intervals spanning 1 hour to >1 year from a dynamic, coastal, temperate marine system (Beaufort Inlet, Beaufort NC USA) to characterize the carbonate system at multiple time scales. Daily and seasonal variation of the carbonate system is largely driven by temperature, alkalinity and the balance between primary production and respiration, but high frequency change (hours to days) is further influenced by water mass movement (e.g. tides) and stochastic events (e.g. storms). Both annual (~0.3 units) and diurnal (~0.1 units) variability in coastal ocean acidity are similar in magnitude to 50-year projections of ocean acidity associated with increasing atmospheric CO2. The environmental variables driving these changes highlight the importance of characterizing the complete carbonate system rather than just pH.
Short-term dynamics of ocean carbon parameters may already exert significant pressure on some coastal marine ecosystems, with implications for ecology, biogeochemistry and evolution, and this shorter-term variability layers additive effects and complexity, including extreme values, on top of long-term trends in ocean acidification.