75 resultados para Monte Carlo method.
Resumo:
This article introduces a new general method for genealogical inference that samples independent genealogical histories using importance sampling (IS) and then samples other parameters with Markov chain Monte Carlo (MCMC). It is then possible to more easily utilize the advantages of importance sampling in a fully Bayesian framework. The method is applied to the problem of estimating recent changes in effective population size from temporally spaced gene frequency data. The method gives the posterior distribution of effective population size at the time of the oldest sample and at the time of the most recent sample, assuming a model of exponential growth or decline during the interval. The effect of changes in number of alleles, number of loci, and sample size on the accuracy of the method is described using test simulations, and it is concluded that these have an approximately equivalent effect. The method is used on three example data sets and problems in interpreting the posterior densities are highlighted and discussed.
Resumo:
Analyses of high-density single-nucleotide polymorphism (SNP) data, such as genetic mapping and linkage disequilibrium (LD) studies, require phase-known haplotypes to allow for the correlation between tightly linked loci. However, current SNP genotyping technology cannot determine phase, which must be inferred statistically. In this paper, we present a new Bayesian Markov chain Monte Carlo (MCMC) algorithm for population haplotype frequency estimation, particulary in the context of LD assessment. The novel feature of the method is the incorporation of a log-linear prior model for population haplotype frequencies. We present simulations to suggest that 1) the log-linear prior model is more appropriate than the standard coalescent process in the presence of recombination (>0.02cM between adjacent loci), and 2) there is substantial inflation in measures of LD obtained by a "two-stage" approach to the analysis by treating the "best" haplotype configuration as correct, without regard to uncertainty in the recombination process. Genet Epidemiol 25:106-114, 2003. (C) 2003 Wiley-Liss, Inc.
Resumo:
Population subdivision complicates analysis of molecular variation. Even if neutrality is assumed, three evolutionary forces need to be considered: migration, mutation, and drift. Simplification can be achieved by assuming that the process of migration among and drift within subpopulations is occurring fast compared to Mutation and drift in the entire population. This allows a two-step approach in the analysis: (i) analysis of population subdivision and (ii) analysis of molecular variation in the migrant pool. We model population subdivision using an infinite island model, where we allow the migration/drift parameter Theta to vary among populations. Thus, central and peripheral populations can be differentiated. For inference of Theta, we use a coalescence approach, implemented via a Markov chain Monte Carlo (MCMC) integration method that allows estimation of allele frequencies in the migrant pool. The second step of this approach (analysis of molecular variation in the migrant pool) uses the estimated allele frequencies in the migrant pool for the study of molecular variation. We apply this method to a Drosophila ananassae sequence data set. We find little indication of isolation by distance, but large differences in the migration parameter among populations. The population as a whole seems to be expanding. A population from Bogor (Java, Indonesia) shows the highest variation and seems closest to the species center.
Resumo:
Quantum calculations of the ground vibrational state tunneling splitting of H-atom and D-atom transfer in malonaldehyde are performed on a full-dimensional ab initio potential energy surface (PES). The PES is a fit to 11 147 near basis-set-limit frozen-core CCSD(T) electronic energies. This surface properly describes the invariance of the potential with respect to all permutations of identical atoms. The saddle-point barrier for the H-atom transfer on the PES is 4.1 kcal/mol, in excellent agreement with the reported ab initio value. Model one-dimensional and "exact" full-dimensional calculations of the splitting for H- and D-atom transfer are done using this PES. The tunneling splittings in full dimensionality are calculated using the unbiased "fixed-node" diffusion Monte Carlo (DMC) method in Cartesian and saddle-point normal coordinates. The ground-state tunneling splitting is found to be 21.6 cm(-1) in Cartesian coordinates and 22.6 cm(-1) in normal coordinates, with an uncertainty of 2-3 cm(-1). This splitting is also calculated based on a model which makes use of the exact single-well zero-point energy (ZPE) obtained with the MULTIMODE code and DMC ZPE and this calculation gives a tunneling splitting of 21-22 cm(-1). The corresponding computed splittings for the D-atom transfer are 3.0, 3.1, and 2-3 cm(-1). These calculated tunneling splittings agree with each other to within less than the standard uncertainties obtained with the DMC method used, which are between 2 and 3 cm(-1), and agree well with the experimental values of 21.6 and 2.9 cm(-1) for the H and D transfer, respectively. (C) 2008 American Institute of Physics.
Resumo:
We have estimated the speed and direction of propagation of a number of Coronal Mass Ejections (CMEs) using single-spacecraft data from the STEREO Heliospheric Imager (HI) wide-field cameras. In general, these values are in good agreement with those predicted by Thernisien, Vourlidas, and Howard in Solar Phys. 256, 111 -aEuro parts per thousand 130 (2009) using a forward modelling method to fit CMEs imaged by the STEREO COR2 coronagraphs. The directions of the CMEs predicted by both techniques are in good agreement despite the fact that many of the CMEs under study travel in directions that cause them to fade rapidly in the HI images. The velocities estimated from both techniques are in general agreement although there are some interesting differences that may provide evidence for the influence of the ambient solar wind on the speed of CMEs. The majority of CMEs with a velocity estimated to be below 400 km s(-1) in the COR2 field of view have higher estimated velocities in the HI field of view, while, conversely, those with COR2 velocities estimated to be above 400 km s(-1) have lower estimated HI velocities. We interpret this as evidence for the deceleration of fast CMEs and the acceleration of slower CMEs by interaction with the ambient solar wind beyond the COR2 field of view. We also show that the uncertainties in our derived parameters are influenced by the range of elongations over which each CME can be tracked. In order to reduce the uncertainty in the predicted arrival time of a CME at 1 Astronomical Unit (AU) to within six hours, the CME needs to be tracked out to at least 30 degrees elongation. This is in good agreement with predictions of the accuracy of our technique based on Monte Carlo simulations.
Resumo:
We describe a Bayesian approach to analyzing multilocus genotype or haplotype data to assess departures from gametic (linkage) equilibrium. Our approach employs a Markov chain Monte Carlo (MCMC) algorithm to approximate the posterior probability distributions of disequilibrium parameters. The distributions are computed exactly in some simple settings. Among other advantages, posterior distributions can be presented visually, which allows the uncertainties in parameter estimates to be readily assessed. In addition, background knowledge can be incorporated, where available, to improve the precision of inferences. The method is illustrated by application to previously published datasets; implications for multilocus forensic match probabilities and for simple association-based gene mapping are also discussed.
Resumo:
This paper introduces a method for simulating multivariate samples that have exact means, covariances, skewness and kurtosis. We introduce a new class of rectangular orthogonal matrix which is fundamental to the methodology and we call these matrices L matrices. They may be deterministic, parametric or data specific in nature. The target moments determine the L matrix then infinitely many random samples with the same exact moments may be generated by multiplying the L matrix by arbitrary random orthogonal matrices. This methodology is thus termed “ROM simulation”. Considering certain elementary types of random orthogonal matrices we demonstrate that they generate samples with different characteristics. ROM simulation has applications to many problems that are resolved using standard Monte Carlo methods. But no parametric assumptions are required (unless parametric L matrices are used) so there is no sampling error caused by the discrete approximation of a continuous distribution, which is a major source of error in standard Monte Carlo simulations. For illustration, we apply ROM simulation to determine the value-at-risk of a stock portfolio.
Resumo:
Identifying a periodic time-series model from environmental records, without imposing the positivity of the growth rate, does not necessarily respect the time order of the data observations. Consequently, subsequent observations, sampled in the environmental archive, can be inversed on the time axis, resulting in a non-physical signal model. In this paper an optimization technique with linear constraints on the signal model parameters is proposed that prevents time inversions. The activation conditions for this constrained optimization are based upon the physical constraint of the growth rate, namely, that it cannot take values smaller than zero. The actual constraints are defined for polynomials and first-order splines as basis functions for the nonlinear contribution in the distance-time relationship. The method is compared with an existing method that eliminates the time inversions, and its noise sensitivity is tested by means of Monte Carlo simulations. Finally, the usefulness of the method is demonstrated on the measurements of the vessel density, in a mangrove tree, Rhizophora mucronata, and the measurement of Mg/Ca ratios, in a bivalve, Mytilus trossulus.
Resumo:
A direct method is presented for determining the uncertainty in reservoir pressure, flow, and net present value (NPV) using the time-dependent, one phase, two- or three-dimensional equations of flow through a porous medium. The uncertainty in the solution is modelled as a probability distribution function and is computed from given statistical data for input parameters such as permeability. The method generates an expansion for the mean of the pressure about a deterministic solution to the system equations using a perturbation to the mean of the input parameters. Hierarchical equations that define approximations to the mean solution at each point and to the field covariance of the pressure are developed and solved numerically. The procedure is then used to find the statistics of the flow and the risked value of the field, defined by the NPV, for a given development scenario. This method involves only one (albeit complicated) solution of the equations and contrasts with the more usual Monte-Carlo approach where many such solutions are required. The procedure is applied easily to other physical systems modelled by linear or nonlinear partial differential equations with uncertain data.
Resumo:
The steadily accumulating literature on technical efficiency in fisheries attests to the importance of efficiency as an indicator of fleet condition and as an object of management concern. In this paper, we extend previous work by presenting a Bayesian hierarchical approach that yields both efficiency estimates and, as a byproduct of the estimation algorithm, probabilistic rankings of the relative technical efficiencies of fishing boats. The estimation algorithm is based on recent advances in Markov Chain Monte Carlo (MCMC) methods— Gibbs sampling, in particular—which have not been widely used in fisheries economics. We apply the method to a sample of 10,865 boat trips in the US Pacific hake (or whiting) fishery during 1987–2003. We uncover systematic differences between efficiency rankings based on sample mean efficiency estimates and those that exploit the full posterior distributions of boat efficiencies to estimate the probability that a given boat has the highest true mean efficiency.
Resumo:
The nonlinearity of high-power amplifiers (HPAs) has a crucial effect on the performance of multiple-input-multiple-output (MIMO) systems. In this paper, we investigate the performance of MIMO orthogonal space-time block coding (OSTBC) systems in the presence of nonlinear HPAs. Specifically, we propose a constellation-based compensation method for HPA nonlinearity in the case with knowledge of the HPA parameters at the transmitter and receiver, where the constellation and decision regions of the distorted transmitted signal are derived in advance. Furthermore, in the scenario without knowledge of the HPA parameters, a sequential Monte Carlo (SMC)-based compensation method for the HPA nonlinearity is proposed, which first estimates the channel-gain matrix by means of the SMC method and then uses the SMC-based algorithm to detect the desired signal. The performance of the MIMO-OSTBC system under study is evaluated in terms of average symbol error probability (SEP), total degradation (TD) and system capacity, in uncorrelated Nakagami-m fading channels. Numerical and simulation results are provided and show the effects on performance of several system parameters, such as the parameters of the HPA model, output back-off (OBO) of nonlinear HPA, numbers of transmit and receive antennas, modulation order of quadrature amplitude modulation (QAM), and number of SMC samples. In particular, it is shown that the constellation-based compensation method can efficiently mitigate the effect of HPA nonlinearity with low complexity and that the SMC-based detection scheme is efficient to compensate for HPA nonlinearity in the case without knowledge of the HPA parameters.
Resumo:
Remotely sensed land cover maps are increasingly used as inputs into environmental simulation models whose outputs inform decisions and policy-making. Risks associated with these decisions are dependent on model output uncertainty, which is in turn affected by the uncertainty of land cover inputs. This article presents a method of quantifying the uncertainty that results from potential mis-classification in remotely sensed land cover maps. In addition to quantifying uncertainty in the classification of individual pixels in the map, we also address the important case where land cover maps have been upscaled to a coarser grid to suit the users’ needs and are reported as proportions of land cover type. The approach is Bayesian and incorporates several layers of modelling but is straightforward to implement. First, we incorporate data in the confusion matrix derived from an independent field survey, and discuss the appropriate way to model such data. Second, we account for spatial correlation in the true land cover map, using the remotely sensed map as a prior. Third, spatial correlation in the mis-classification characteristics is induced by modelling their variance. The result is that we are able to simulate posterior means and variances for individual sites and the entire map using a simple Monte Carlo algorithm. The method is applied to the Land Cover Map 2000 for the region of England and Wales, a map used as an input into a current dynamic carbon flux model.
Resumo:
Many key economic and financial series are bounded either by construction or through policy controls. Conventional unit root tests are potentially unreliable in the presence of bounds, since they tend to over-reject the null hypothesis of a unit root, even asymptotically. So far, very little work has been undertaken to develop unit root tests which can be applied to bounded time series. In this paper we address this gap in the literature by proposing unit root tests which are valid in the presence of bounds. We present new augmented Dickey–Fuller type tests as well as new versions of the modified ‘M’ tests developed by Ng and Perron [Ng, S., Perron, P., 2001. LAG length selection and the construction of unit root tests with good size and power. Econometrica 69, 1519–1554] and demonstrate how these tests, combined with a simulation-based method to retrieve the relevant critical values, make it possible to control size asymptotically. A Monte Carlo study suggests that the proposed tests perform well in finite samples. Moreover, the tests outperform the Phillips–Perron type tests originally proposed in Cavaliere [Cavaliere, G., 2005. Limited time series with a unit root. Econometric Theory 21, 907–945]. An illustrative application to U.S. interest rate data is provided
Resumo:
In the concluding paper of this tetralogy, we here use the different geomagnetic activity indices to reconstruct the near-Earth interplanetary magnetic field (IMF) and solar wind flow speed, as well as the open solar flux (OSF) from 1845 to the present day. The differences in how the various indices vary with near-Earth interplanetary parameters, which are here exploited to separate the effects of the IMF and solar wind speed, are shown to be statistically significant at the 93% level or above. Reconstructions are made using four combinations of different indices, compiled using different data and different algorithms, and the results are almost identical for all parameters. The correction to the aa index required is discussed by comparison with the Ap index from a more extensive network of mid-latitude stations. Data from the Helsinki magnetometer station is used to extend the aa index back to 1845 and the results confirmed by comparison with the nearby St Petersburg observatory. The optimum variations, using all available long-term geomagnetic indices, of the near-Earth IMF and solar wind speed, and of the open solar flux, are presented; all with ±2sigma� uncertainties computed using the Monte Carlo technique outlined in the earlier papers. The open solar flux variation derived is shown to be very similar indeed to that obtained using the method of Lockwood et al. (1999).
Resumo:
A truly variance-minimizing filter is introduced and its per for mance is demonstrated with the Korteweg– DeV ries (KdV) equation and with a multilayer quasigeostrophic model of the ocean area around South Africa. It is recalled that Kalman-like filters are not variance minimizing for nonlinear model dynamics and that four - dimensional variational data assimilation (4DV AR)-like methods relying on per fect model dynamics have dif- ficulty with providing error estimates. The new method does not have these drawbacks. In fact, it combines advantages from both methods in that it does provide error estimates while automatically having balanced states after analysis, without extra computations. It is based on ensemble or Monte Carlo integrations to simulate the probability density of the model evolution. When obser vations are available, the so-called importance resampling algorithm is applied. From Bayes’ s theorem it follows that each ensemble member receives a new weight dependent on its ‘ ‘distance’ ’ t o the obser vations. Because the weights are strongly var ying, a resampling of the ensemble is necessar y. This resampling is done such that members with high weights are duplicated according to their weights, while low-weight members are largely ignored. In passing, it is noted that data assimilation is not an inverse problem by nature, although it can be for mulated that way . Also, it is shown that the posterior variance can be larger than the prior if the usual Gaussian framework is set aside. However , i n the examples presented here, the entropy of the probability densities is decreasing. The application to the ocean area around South Africa, gover ned by strongly nonlinear dynamics, shows that the method is working satisfactorily . The strong and weak points of the method are discussed and possible improvements are proposed.