34 results for Random Walk Models


Relevance:

80.00%

Publisher:

Abstract:

In this paper we develop a set of novel Markov chain Monte Carlo algorithms for Bayesian smoothing of partially observed non-linear diffusion processes. The sampling algorithms developed herein use a deterministic approximation to the posterior distribution over paths as the proposal distribution for a mixture of an independence and a random walk sampler. The approximating distribution is sampled by simulating an optimized time-dependent linear diffusion process derived from the recently developed variational Gaussian process approximation method. Flexible blocking strategies are introduced to further improve mixing, and thus the efficiency, of the sampling algorithms. The algorithms are tested on two diffusion processes: one with double-well potential drift and another with sine drift. The new algorithm's accuracy and efficiency are compared with state-of-the-art hybrid Monte Carlo based path sampling. It is shown that in practical, finite-sample applications the algorithm is accurate except in the presence of large observation errors and low observation densities, which lead to a multi-modal structure in the posterior distribution over paths. More importantly, the variational approximation assisted sampling algorithm outperforms hybrid Monte Carlo in terms of computational efficiency, except when the diffusion process is densely observed with small errors, in which case both algorithms are equally efficient.
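The mixture of an independence sampler and a random walk sampler mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the paper's path-space algorithm: the bimodal `log_target`, the fixed Gaussian approximation (`approx_mean`, `approx_std`) standing in for the variational approximation, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def log_target(x):
    # Hypothetical unnormalised log-density with modes near -1 and +1.
    return -2.0 * (x * x - 1.0) ** 2


def log_normal_pdf(x, mean, std):
    # Log-density of a Gaussian, used for the independence proposal.
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def mixture_mh(n_steps, p_indep=0.5, approx_mean=0.0, approx_std=1.5,
               rw_std=0.3, seed=0):
    """Metropolis-Hastings with a mixture of two proposal kernels.

    With probability p_indep the proposal is drawn from a fixed Gaussian
    approximation (independence sampler); otherwise it is a Gaussian
    random walk step around the current state.
    """
    rng = random.Random(seed)
    x = approx_mean
    samples = []
    accepted = 0
    for _ in range(n_steps):
        if rng.random() < p_indep:
            # Independence move: the proposal density enters the ratio.
            y = rng.gauss(approx_mean, approx_std)
            log_ratio = (log_target(y) - log_target(x)
                         + log_normal_pdf(x, approx_mean, approx_std)
                         - log_normal_pdf(y, approx_mean, approx_std))
        else:
            # Random walk move: symmetric proposal, densities cancel.
            y = rng.gauss(x, rw_std)
            log_ratio = log_target(y) - log_target(x)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


samples, acceptance_rate = mixture_mh(20000)
```

Each kernel leaves the target invariant on its own, so choosing between them at random at every step does as well. The independence moves allow jumps between modes, while the random walk moves refine the state locally.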

Relevance:

80.00%

Publisher:

Abstract:

Understanding a complex network's structure holds the key to understanding its function. The physics community has contributed a multitude of methods and analyses to this cross-disciplinary endeavor. Structural features exist on both the microscopic level, resulting from differences between single node properties, and the mesoscopic level, resulting from properties shared by groups of nodes. Disentangling the determinants of network structure on these different scales has remained a major, and so far unsolved, challenge. Here we show how multiscale generative probabilistic exponential random graph models combined with efficient, distributive message-passing inference techniques can be used to achieve this separation of scales, leading to improved detection accuracy of latent classes as demonstrated on benchmark problems. The approach sheds new light on the statistical significance of motif-distributions in neural networks and improves the link-prediction accuracy as exemplified for gene-disease associations in the highly consequential Online Mendelian Inheritance in Man database. © 2011 Reichardt et al.

Relevance:

80.00%

Publisher:

Abstract:

The principled statistical application of Gaussian random field models used in geostatistics has historically been limited to data sets of a small size. This limitation is imposed by the requirement to store and invert the covariance matrix of all the samples to obtain a predictive distribution at unsampled locations, or to use likelihood-based covariance estimation. Various ad hoc approaches to solve this problem have been adopted, such as selecting a neighborhood region and/or a small number of observations to use in the kriging process, but these have no sound theoretical basis and it is unclear what information is being lost. In this article, we present a Bayesian method for estimating the posterior mean and covariance structures of a Gaussian random field using a sequential estimation algorithm. By imposing sparsity in a well-defined framework, the algorithm retains a subset of “basis vectors” that best represent the “true” posterior Gaussian random field model in the relative entropy sense. This allows a principled treatment of Gaussian random field models on very large data sets. The method is particularly appropriate when the Gaussian random field model is regarded as a latent variable model, which may be nonlinearly related to the observations. We show the application of the sequential, sparse Bayesian estimation in Gaussian random field models and discuss its merits and drawbacks.

Relevance:

80.00%

Publisher:

Abstract:

Two main opposing viewpoints exist for analysing financial time series: either capital markets are completely stochastic and prices therefore follow a random walk, or they are deterministic and consequently predictable. For each of these views a great variety of tools exists with which one can try to confirm the hypotheses. Unfortunately, these methods are not well suited for dealing with data characterised in part by both paradigms. This thesis investigates these two approaches in order to model the behaviour of financial time series. In the deterministic framework, methods are used to characterise the dimensionality of embedded financial data. The stochastic approach here includes an estimation of the unconditional and conditional return distributions using parametric, non-parametric and semi-parametric density estimation techniques. Finally, it is shown how elements from these two approaches could be combined to achieve a more realistic model for financial time series.

Relevance:

80.00%

Publisher:

Abstract:

We obtain the exact asymptotic result for the disorder-averaged probability distribution function for a random walk in a biased Sinai model and show that it is characterized by a creeping behavior of the displacement moments with time, similar to v(mu n), where mu <1 is dimensionless mean drift. We employ a method originated in quantum diffusion which is based on the exact mapping of the problem to an imaginary-time Schrödinger equation. For nonzero drift such an equation has an isolated lowest eigenvalue separated by a gap from quasicontinuous excited states, and the eigenstate corresponding to the former governs the long-time asymptotic behavior.

Relevance:

80.00%

Publisher:

Abstract:

Tests for random walk behaviour in the Italian stock market are presented, based on an investigation of the fractal properties of the log return series for the Mibtel index. The random walk hypothesis is evaluated against alternatives accommodating either unifractality or multifractality. Critical values for the test statistics are generated using Monte Carlo simulations of random Gaussian innovations. Evidence is reported of multifractality, and the departure from random walk behaviour is statistically significant on standard criteria. The observed pattern is attributed primarily to fat tails in the return probability distribution, associated with volatility clustering in returns measured over various time scales. © 2009 Elsevier Inc. All rights reserved.

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we investigate the use of manifold learning techniques to enhance the separation properties of standard graph kernels. The idea stems from the observation that when we perform multidimensional scaling on the distance matrices extracted from the kernels, the resulting data tends to be clustered along a curve that wraps around the embedding space. This behavior suggests that long-range distances are not estimated accurately, resulting in an increased curvature of the embedding space. Hence, we propose to use a number of manifold learning techniques to compute a low-dimensional embedding of the graphs in an attempt to unfold the embedding manifold and increase the class separation. We perform an extensive experimental evaluation on a number of standard graph datasets using the shortest-path (Borgwardt and Kriegel, 2005), graphlet (Shervashidze et al., 2009), random walk (Kashima et al., 2003) and Weisfeiler-Lehman (Shervashidze et al., 2011) kernels. We observe the most significant improvement in the case of the graphlet kernel, which fits with the observation that neglecting the locational information of the substructures leads to a stronger curvature of the embedding manifold. On the other hand, the Weisfeiler-Lehman kernel partially mitigates the locality problem by using the node label information, and thus does not clearly benefit from the manifold learning. Interestingly, our experiments also show that the unfolding of the space seems to reduce the performance gap between the examined kernels.

Relevance:

80.00%

Publisher:

Abstract:

In this paper we develop a set of novel Markov chain Monte Carlo algorithms for Bayesian smoothing of partially observed non-linear diffusion processes. The sampling algorithms developed herein use a deterministic approximation to the posterior distribution over paths as the proposal distribution for a mixture of an independence and a random walk sampler. The approximating distribution is sampled by simulating an optimized time-dependent linear diffusion process derived from the recently developed variational Gaussian process approximation method. The novel diffusion bridge proposal derived from the variational approximation allows the use of a flexible blocking strategy that further improves mixing, and thus the efficiency, of the sampling algorithms. The algorithms are tested on two diffusion processes: one with double-well potential drift and another with sine drift. The new algorithm's accuracy and efficiency are compared with state-of-the-art hybrid Monte Carlo based path sampling. It is shown that in practical, finite-sample applications the algorithm is accurate except in the presence of large observation errors and low observation densities, which lead to a multi-modal structure in the posterior distribution over paths. More importantly, the variational approximation assisted sampling algorithm outperforms hybrid Monte Carlo in terms of computational efficiency, except when the diffusion process is densely observed with small errors, in which case both algorithms are equally efficient. © 2011 Springer-Verlag.

Relevance:

30.00%

Publisher:

Abstract:

The thesis presents a two-dimensional Risk Assessment Method (RAM) where the assessment of risk to the groundwater resources incorporates both the quantification of the probability of the occurrence of contaminant source terms and the assessment of the resultant impacts. The approach emphasizes the need for a greater dependency on the potential pollution sources, rather than the traditional approach where assessment is based mainly on the intrinsic geo-hydrologic parameters. The risk is calculated using Monte Carlo simulation methods whereby random pollution events are generated from the same distribution as historically occurring events or from an a priori probability distribution. Integrated mathematical models then simulate contaminant concentrations at the predefined monitoring points within the aquifer. The spatial and temporal distributions of the concentrations are calculated from repeated realisations, and the number of times a user-defined concentration magnitude is exceeded is quantified as a risk. The method was set up by integrating MODFLOW-2000, MT3DMS and a FORTRAN-coded risk model, and automated using a DOS batch processing file. GIS software was employed in producing the input files and for the presentation of the results. The functionalities of the method, as well as its sensitivities to the model grid sizes, contaminant loading rates, length of stress periods, and the historical frequencies of occurrence of pollution events, were evaluated using hypothetical scenarios and a case study. Chloride-related pollution sources were compiled and used as indicative potential contaminant sources for the case study. At any active model cell, if a randomly generated number is less than the probability of pollution occurrence, then the risk model generates a synthetic contaminant source term as an input into the transport model. The results of the applications of the method are presented in the form of tables, graphs and spatial maps.
Varying the model grid sizes indicates no significant effects on the simulated groundwater head. The simulated frequency of daily occurrence of pollution incidents is also independent of the model dimensions. However, the simulated total contaminant mass generated within the aquifer and the associated volumetric numerical error appear to increase with increasing grid sizes. Also, the contaminant plume advances faster with the coarse grid sizes than with the finer grid sizes. The number of daily contaminant source terms generated, and consequently the total mass of contaminant within the aquifer, increases nonlinearly with the increasing frequency of occurrence of pollution events. The risk of pollution from a number of sources all occurring by chance together was evaluated and quantitatively presented as risk maps. This capability to combine the risk to a groundwater feature from numerous potential sources of pollution proved to be a great asset to the method, and a large benefit over contemporary risk and vulnerability methods.
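The source-term rule and exceedance count described in this abstract can be sketched roughly as follows. This is only a toy illustration: the thesis couples MODFLOW-2000 and MT3DMS for flow and transport, whereas here a hypothetical fixed `attenuation` factor per cell stands in for the transport model, and the function name, parameters, and the choice to report risk as a fraction of realisations are all assumptions.

```python
import random


def simulate_risk(cell_probs, source_mass, attenuation, threshold,
                  n_realisations, seed=0):
    """Estimate exceedance risk at one monitoring point by Monte Carlo.

    cell_probs  -- probability of a pollution event in each active cell
    source_mass -- contaminant mass released by one event (hypothetical)
    attenuation -- per-cell factor mapping released mass to concentration
                   at the monitoring point (toy stand-in for transport)
    threshold   -- user-defined concentration magnitude
    """
    rng = random.Random(seed)
    exceedances = 0
    for _ in range(n_realisations):
        concentration = 0.0
        for prob, factor in zip(cell_probs, attenuation):
            # A source term is generated when the random number falls
            # below the cell's probability of pollution occurrence.
            if rng.random() < prob:
                concentration += source_mass * factor
        if concentration > threshold:
            exceedances += 1
    # Assumption: risk is reported as the fraction of realisations
    # in which the threshold was exceeded.
    return exceedances / n_realisations


risk = simulate_risk(
    cell_probs=[0.3],
    source_mass=1.0,
    attenuation=[1.0],
    threshold=0.5,
    n_realisations=20000,
)
```

With a single cell whose event always exceeds the threshold, the estimated risk should approach that cell's event probability as the number of realisations grows.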

Relevance:

30.00%

Publisher:

Abstract:

Computer models, or simulators, are widely used in a range of scientific fields to aid understanding of the processes involved and make predictions. Such simulators are often computationally demanding and are thus not amenable to statistical analysis. Emulators provide a statistical approximation, or surrogate, for the simulators accounting for the additional approximation uncertainty. This thesis develops a novel sequential screening method to reduce the set of simulator variables considered during emulation. This screening method is shown to require fewer simulator evaluations than existing approaches. Utilising the lower dimensional active variable set simplifies subsequent emulation analysis. For random output, or stochastic, simulators the output dispersion, and thus variance, is typically a function of the inputs. This work extends the emulator framework to account for such heteroscedasticity by constructing two new heteroscedastic Gaussian process representations and proposes an experimental design technique to optimally learn the model parameters. The design criterion is an extension of Fisher information to heteroscedastic variance models. Replicated observations are efficiently handled in both the design and model inference stages. Through a series of simulation experiments on both synthetic and real world simulators, the emulators inferred on optimal designs with replicated observations are shown to outperform equivalent models inferred on space-filling replicate-free designs in terms of both model parameter uncertainty and predictive variance.

Relevance:

30.00%

Publisher:

Abstract:

We introduce models of heterogeneous systems with finite connectivity defined on random graphs to capture finite-coordination effects on the low-temperature behaviour of finite-dimensional systems. Our models use a description in terms of small deviations of particle coordinates from a set of reference positions, particularly appropriate for the description of low-temperature phenomena. A Born-von Karman-type expansion with random coefficients is used to model effects of frozen heterogeneities. The key quantity appearing in the theoretical description is a full distribution of effective single-site potentials which needs to be determined self-consistently. If microscopic interactions are harmonic, the effective single-site potentials turn out to be harmonic as well, and the distribution of these single-site potentials is equivalent to a distribution of localization lengths used earlier in the description of chemical gels. For structural glasses characterized by frustration and anharmonicities in the microscopic interactions, the distribution of single-site potentials involves anharmonicities of all orders, and both single-well and double-well potentials are observed, the latter with a broad spectrum of barrier heights. The appearance of glassy phases at low temperatures is marked by the appearance of asymmetries in the distribution of single-site potentials, as previously observed for fully connected systems. Double-well potentials with a broad spectrum of barrier heights and asymmetries would give rise to the well-known universal glassy low-temperature anomalies when quantum effects are taken into account. © 2007 IOP Publishing Ltd.

Relevance:

30.00%

Publisher:

Abstract:

Random Boolean formulae, generated by a growth process of noisy logical gates, are analyzed using the generating functional methodology of statistical physics. We study the type of functions generated for different input distributions, their robustness for a given level of gate error, and its dependence on the formulae depth and complexity and the gates used. Bounds on their performance, derived in the information theory literature for specific gates, are straightforwardly retrieved, generalized and identified as the corresponding typical-case phase transitions. Results for error-rates, function-depth and sensitivity of the generated functions are obtained for various gate-type and noise models. © 2010 IOP Publishing Ltd.

Relevance:

30.00%

Publisher:

Abstract:

In this paper we present a novel method for emulating a stochastic, or random output, computer model and show its application to a complex rabies model. The method is evaluated both in terms of accuracy and computational efficiency on synthetic data and the rabies model. We address the issue of experimental design and provide empirical evidence on the effectiveness of utilizing replicate model evaluations compared to a space-filling design. We employ the Mahalanobis error measure to validate the heteroscedastic Gaussian process based emulator predictions for both the mean and (co)variance. The emulator allows efficient screening to identify important model inputs and better understanding of the complex behaviour of the rabies model.

Relevance:

30.00%

Publisher:

Abstract:

This thesis includes analysis of disordered spin ensembles corresponding to Exact Cover, a multi-access channel problem, and composite models combining sparse and dense interactions. The satisfiability problem in Exact Cover is addressed using a statistical analysis of a simple branch and bound algorithm. The algorithm can be formulated in the large system limit as a branching process, for which critical properties can be analysed. Far from the critical point a set of differential equations may be used to model the process, and these are solved by numerical integration and exact bounding methods. The multi-access channel problem is formulated as an equilibrium statistical physics problem for the case of bit transmission on a channel with power control and synchronisation. A sparse code division multiple access method is considered and the optimal detection properties are examined in the typical case by use of the replica method, and compared to detection performance achieved by interactive decoding methods. These codes are found to have phenomena closely resembling the well-understood dense codes. The composite model is introduced as an abstraction of canonical sparse and dense disordered spin models. The model includes couplings due to both dense and sparse topologies simultaneously. The new type of code is shown to outperform sparse and dense codes in some regimes, both in optimal performance and in performance achieved by iterative detection methods in finite systems.

Relevance:

30.00%

Publisher:

Abstract:

We consider the random input problem for a nonlinear system modeled by the integrable one-dimensional self-focusing nonlinear Schrödinger equation (NLSE). We concentrate on the properties obtained from the direct scattering problem associated with the NLSE. We discuss some general issues regarding soliton creation from random input. We also study the averaged spectral density of random quasilinear waves generated in the NLSE channel for two models of the disordered input field profile. The first model is symmetric complex Gaussian white noise and the second one is a real dichotomous (telegraph) process. For the former model, the closed-form expression for the averaged spectral density is obtained, while for the dichotomous real input we present the small noise perturbative expansion for the same quantity. In the case of the dichotomous input, we also obtain the distribution of the minimal pulse width required for soliton generation. The obtained results can be applied to a multitude of problems including random nonlinear Fraunhofer diffraction, transmission properties of randomly apodized long period Fiber Bragg gratings, and the propagation of incoherent pulses in optical fibers.