12 results for Resampling

in CentAUR: Central Archive University of Reading - UK


Relevance:

10.00% 10.00%

Publisher:

Abstract:

This note considers the variance estimation for population size estimators based on capture–recapture experiments. Whereas a diversity of estimators of the population size has been suggested, the question of estimating the associated variances is less frequently addressed. This note points out that the technique of conditioning can be applied here successfully which also allows us to identify sources of variation: the variance due to estimation of the model parameters and the binomial variance due to sampling n units from a population of size N. It is applied to estimators typically used in capture–recapture experiments in continuous time including the estimators of Zelterman and Chao and improves upon previously used variance estimators. In addition, knowledge of the variances associated with the estimators by Zelterman and Chao allows the suggestion of a new estimator as the weighted sum of the two. The decomposition of the variance into the two sources allows also a new understanding of how resampling techniques like the Bootstrap could be used appropriately. Finally, the sample size question for capture–recapture experiments is addressed. Since the variance of population size estimators increases with the sample size, it is suggested to use relative measures such as the observed-to-hidden ratio or the completeness of identification proportion for approaching the question of sample size choice.
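The two estimators named in this abstract can be sketched from frequency counts alone. This is a minimal sketch, assuming the usual textbook forms: `n` is the number of distinct units observed, `f1` the number seen exactly once, and `f2` the number seen exactly twice. The inverse-variance weighting in `weighted_estimate` is an illustrative assumption (it also ignores any correlation between the two estimates); the abstract does not state which weights the authors use.

```python
import math

def chao_estimate(n, f1, f2):
    """Chao's lower-bound estimator: n + f1^2 / (2 * f2)."""
    return n + f1 ** 2 / (2 * f2)

def zelterman_estimate(n, f1, f2):
    """Zelterman's estimator: n / (1 - exp(-lam)) with lam = 2 * f2 / f1."""
    lam = 2 * f2 / f1
    return n / (1 - math.exp(-lam))

def weighted_estimate(est_a, var_a, est_b, var_b):
    """Weighted sum of two estimates using inverse-variance weights (assumed scheme)."""
    w = var_b / (var_a + var_b)  # weight on est_a; larger when est_a is less variable
    return w * est_a + (1 - w) * est_b

if __name__ == "__main__":
    n, f1, f2 = 100, 60, 20  # hypothetical counts, for illustration only
    print(chao_estimate(n, f1, f2))
    print(zelterman_estimate(n, f1, f2))
```

Both estimators use only the singleton and doubleton counts beyond `n`, which is why their variances can be studied within the same conditioning framework.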

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A novel approach is presented for the evaluation of circulation type classifications (CTCs) in terms of their capability to predict surface climate variations. The approach is analogous to that for probabilistic meteorological forecasts and is based on the Brier skill score. This score is shown to take a particularly simple form in the context of CTCs and to quantify the resolution of a climate variable by the classifications. The sampling uncertainty of the skill can be estimated by means of nonparametric bootstrap resampling. The evaluation approach is applied for a systematic intercomparison of 71 CTCs (objective and manual, from COST Action 733) with respect to their ability to resolve daily precipitation in the Alpine region. For essentially all CTCs, the Brier skill score is found to be higher for weak and moderate compared to intense precipitation, for winter compared to summer, and over the north and west of the Alps compared to the south and east. Moreover, CTCs with a higher number of types exhibit better skill than CTCs with few types. Among CTCs with comparable type number, the best automatic classifications are found to outperform the best manual classifications. It is not possible to single out one ‘best’ classification for Alpine precipitation, but there is a small group showing particularly high skill.
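A skill score of this kind and its bootstrap uncertainty can be sketched as follows. This is a sketch under stated assumptions, not the authors' code: the score is taken to be the resolution term of the Brier score divided by the climatological uncertainty, with the conditional event frequency of each circulation type used as the forecast probability. The function names and the choice to resample individual days independently (which ignores serial correlation) are illustrative.

```python
import random

def brier_skill_score(types, events):
    """Skill of a classification for a binary event, relative to climatology.

    types  : circulation type label for each day
    events : 1 if the event (e.g. precipitation above a threshold) occurred, else 0
    """
    n = len(events)
    base_rate = sum(events) / n
    uncertainty = base_rate * (1 - base_rate)
    if uncertainty == 0:
        raise ValueError("event never or always occurs; skill is undefined")
    # Group event outcomes by circulation type.
    by_type = {}
    for t, e in zip(types, events):
        by_type.setdefault(t, []).append(e)
    # Resolution: frequency-weighted spread of the conditional event frequencies.
    resolution = sum(
        len(v) * (sum(v) / len(v) - base_rate) ** 2 for v in by_type.values()
    ) / n
    return resolution / uncertainty

def bootstrap_bss(types, events, n_boot=1000, seed=0):
    """Resample days with replacement and recompute the score each time."""
    rng = random.Random(seed)
    n = len(events)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            scores.append(
                brier_skill_score([types[i] for i in idx], [events[i] for i in idx])
            )
        except ValueError:
            continue  # skip degenerate resamples with no variation in the event
    return sorted(scores)
```

Percentiles of the sorted list returned by `bootstrap_bss` give a rough sampling interval for the skill.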

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Ensemble clustering (EC) can arise in data assimilation with ensemble square root filters (EnSRFs) using non-linear models: an M-member ensemble splits into a single outlier and a cluster of M−1 members. The stochastic Ensemble Kalman Filter does not present this problem. Modifications to the EnSRFs by a periodic resampling of the ensemble through random rotations have been proposed to address it. We introduce a metric to quantify the presence of EC and present evidence to dispel the notion that EC leads to filter failure. Starting from a univariate model, we show that EC is not a permanent but a transient phenomenon; it occurs intermittently in non-linear models. We perform a series of data assimilation experiments using a standard EnSRF and an EnSRF modified by resampling through random rotations. The modified EnSRF alleviates issues associated with EC at the cost of traceability of individual ensemble trajectories, and it cannot use some of the algorithms that enhance the performance of the standard EnSRF. In the non-linear regimes of low-dimensional models, the analysis root mean square error of the standard EnSRF slowly grows with ensemble size if the size is larger than the dimension of the model state. However, we do not observe this problem in a more complex model that uses an ensemble size much smaller than the dimension of the model state, along with inflation and localisation. Overall, we find that transient EC does not handicap the performance of the standard EnSRF.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Interest in the impacts of climate change is ever increasing. This is particularly true of the water sector where understanding potential changes in the occurrence of both floods and droughts is important for strategic planning. Climate variability has been shown to have a significant impact on UK climate, and accounting for this in future climate change projections is essential to fully anticipate potential future impacts. In this paper a new resampling methodology is developed which includes the variability of both baseline and future precipitation. The resampling methodology is applied to 13 CMIP3 climate models for the 2080s, resulting in an ensemble of monthly precipitation change factors. The change factors are applied to the Eden catchment in eastern Scotland with analysis undertaken for the sensitivity of future river flows to the changes in precipitation. Climate variability is shown to influence the magnitude and direction of change of both precipitation and in turn river flow, which are not apparent without the use of the resampling methodology. The transformation of precipitation changes to river flow changes displays a degree of non-linearity due to the catchment's role in buffering the response. The resampling methodology developed in this paper provides a new technique for creating climate change scenarios which incorporate the important issue of climate variability.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We present a benchmark system for global vegetation models. This system provides a quantitative evaluation of multiple simulated vegetation properties, including primary production; seasonal net ecosystem production; vegetation cover, composition and height; fire regime; and runoff. The benchmarks are derived from remotely sensed gridded datasets and site-based observations. The datasets allow comparisons of annual average conditions and seasonal and inter-annual variability, and they allow the impact of spatial and temporal biases in means and variability to be assessed separately. Specifically designed metrics quantify model performance for each process, and are compared to scores based on the temporal or spatial mean value of the observations and a “random” model produced by bootstrap resampling of the observations. The benchmark system is applied to three models: a simple light-use efficiency and water-balance model (the Simple Diagnostic Biosphere Model: SDBM), and the Lund-Potsdam-Jena (LPJ) and Land Processes and eXchanges (LPX) dynamic global vegetation models (DGVMs). SDBM reproduces observed CO2 seasonal cycles, but its simulation of independent measurements of net primary production (NPP) is too high. The two DGVMs show little difference for most benchmarks (including the interannual variability in the growth rate and seasonal cycle of atmospheric CO2), but LPX represents burnt fraction demonstrably more accurately. Benchmarking also identified several weaknesses common to both DGVMs. The benchmarking system provides a quantitative approach for evaluating how adequately processes are represented in a model, identifying errors and biases, tracking improvements in performance through model development, and discriminating among models. Adoption of such a system would do much to improve confidence in terrestrial model predictions of climate change impacts and feedbacks.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The hybrid Monte Carlo (HMC) method is a popular and rigorous method for sampling from a canonical ensemble. The HMC method is based on classical molecular dynamics simulations combined with a Metropolis acceptance criterion and a momentum resampling step. While the HMC method completely resamples the momentum after each Monte Carlo step, the generalized hybrid Monte Carlo (GHMC) method can be implemented with a partial momentum refreshment step. This property seems desirable for keeping some of the dynamic information throughout the sampling process, similar to stochastic Langevin and Brownian dynamics simulations. It is, however, crucial to the success of the GHMC method that the rejection rate in the molecular dynamics part is kept at a minimum. Otherwise an undesirable Zitterbewegung in the Monte Carlo samples is observed. In this paper, we describe a method to achieve very low rejection rates by using a modified energy, which is preserved to high order along molecular dynamics trajectories. The modified energy is based on backward error results for symplectic time-stepping methods. The proposed generalized shadow hybrid Monte Carlo (GSHMC) method is applicable to NVT as well as NPT ensemble simulations.
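The difference between complete and partial momentum refreshment can be sketched in a few lines. This is a minimal sketch, assuming the commonly used mixing rule p' = cos(phi) p + sin(phi) xi with Gaussian noise xi, a scalar mass, and unit temperature; the function name and parameters are illustrative and not taken from the paper.

```python
import math
import random

def refresh_momentum(p, phi, rng, mass=1.0):
    """Partial momentum refreshment: p' = cos(phi) * p + sin(phi) * xi.

    xi is drawn from N(0, mass) per component (unit temperature assumed).
    phi = pi / 2 discards the old momentum (complete resampling, as in HMC);
    a small phi keeps most of it (partial refreshment, as in GHMC).
    """
    sigma = math.sqrt(mass)
    return [
        math.cos(phi) * p_i + math.sin(phi) * rng.gauss(0.0, sigma)
        for p_i in p
    ]

if __name__ == "__main__":
    rng = random.Random(42)
    print(refresh_momentum([1.0, -2.0], 0.1, rng))          # mostly retained
    print(refresh_momentum([1.0, -2.0], math.pi / 2, rng))  # fully resampled
```

Because cos²(phi) + sin²(phi) = 1, the update leaves the Gaussian momentum distribution invariant for any mixing angle.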

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Many applications, such as intermittent data assimilation, lead to a recursive application of Bayesian inference within a Monte Carlo context. Popular data assimilation algorithms include sequential Monte Carlo methods and ensemble Kalman filters (EnKFs). These methods differ in the way Bayesian inference is implemented. Sequential Monte Carlo methods rely on importance sampling combined with a resampling step, while EnKFs utilize a linear transformation of Monte Carlo samples based on the classic Kalman filter. While EnKFs have proven to be quite robust even for small ensemble sizes, they are not consistent since their derivation relies on a linear regression ansatz. In this paper, we propose another transform method, which does not rely on any a priori assumptions on the underlying prior and posterior distributions. The new method is based on solving an optimal transportation problem for discrete random variables. © 2013, Society for Industrial and Applied Mathematics
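The idea of replacing the resampling step by a transport map can be illustrated in one dimension, where the optimal coupling for a squared-distance cost is the monotone one and needs no linear-programming solver. This is a simplified sketch under that assumption; the general multivariate problem treated in the paper requires solving a full optimal transportation problem, and the function name here is hypothetical.

```python
def transport_1d(particles, weights):
    """Turn weighted particles into equally weighted ones via a monotone coupling.

    particles : scalar forecast particles
    weights   : normalised importance weights (non-negative, summing to one)
    """
    m = len(particles)
    order = sorted(range(m), key=lambda i: particles[i])
    xs = [particles[i] for i in order]
    supply = [weights[i] for i in order]  # mass still available at each source
    new_sorted = []
    i = 0
    for _ in range(m):
        demand = 1.0 / m  # every new particle must receive mass 1/M
        acc = 0.0
        while demand > 1e-15 and i < m:
            moved = min(supply[i], demand)
            acc += moved * xs[i]
            supply[i] -= moved
            demand -= moved
            if supply[i] <= 1e-15:
                i += 1  # this source is exhausted, move to the next one
        new_sorted.append(acc * m)  # conditional mean under the coupling
    # Return the new particles in the original particle order.
    result = [0.0] * m
    for rank, idx in enumerate(order):
        result[idx] = new_sorted[rank]
    return result

if __name__ == "__main__":
    print(transport_1d([0.0, 1.0], [0.25, 0.75]))
```

The transformed ensemble is deterministic given the weights, and its mean equals the importance-weighted mean of the original particles.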

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Particle filters are fully non-linear data assimilation techniques that aim to represent the probability distribution of the model state given the observations (the posterior) by a number of particles. In high-dimensional geophysical applications, the number of particles required by the sequential importance resampling (SIR) particle filter to capture the high-probability region of the posterior is too large for the filter to be usable. However, particle filters can be formulated using proposal densities, which gives greater freedom in how particles are sampled and allows for a much smaller number of particles. Here a particle filter is presented which uses the proposal density to ensure that all particles end up in the high-probability region of the posterior probability density function. This gives rise to the possibility of non-linear data assimilation in high-dimensional systems. The particle filter formulation is compared to the optimal proposal density particle filter and the implicit particle filter, both of which also utilise a proposal density. We show that when observations are available at every time step, both schemes will be degenerate when the number of independent observations is large, unlike the new scheme. The sensitivity of the new scheme to its parameter values is explored theoretically and demonstrated using the Lorenz (1963) model.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We present a benchmark system for global vegetation models. This system provides a quantitative evaluation of multiple simulated vegetation properties, including primary production; seasonal net ecosystem production; vegetation cover; composition and height; fire regime; and runoff. The benchmarks are derived from remotely sensed gridded datasets and site-based observations. The datasets allow comparisons of annual average conditions and seasonal and inter-annual variability, and they allow the impact of spatial and temporal biases in means and variability to be assessed separately. Specifically designed metrics quantify model performance for each process, and are compared to scores based on the temporal or spatial mean value of the observations and a "random" model produced by bootstrap resampling of the observations. The benchmark system is applied to three models: a simple light-use efficiency and water-balance model (the Simple Diagnostic Biosphere Model: SDBM), the Lund-Potsdam-Jena (LPJ) and Land Processes and eXchanges (LPX) dynamic global vegetation models (DGVMs). In general, the SDBM performs better than either of the DGVMs. It reproduces independent measurements of net primary production (NPP) but underestimates the amplitude of the observed CO2 seasonal cycle. The two DGVMs show little difference for most benchmarks (including the inter-annual variability in the growth rate and seasonal cycle of atmospheric CO2), but LPX represents burnt fraction demonstrably more accurately. Benchmarking also identified several weaknesses common to both DGVMs. The benchmarking system provides a quantitative approach for evaluating how adequately processes are represented in a model, identifying errors and biases, tracking improvements in performance through model development, and discriminating among models. Adoption of such a system would do much to improve confidence in terrestrial model predictions of climate change impacts and feedbacks.
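The comparison against a mean model and a bootstrap "random" model can be sketched as follows. The abstract does not name its metrics, so the normalised mean error used here is an assumption chosen for illustration; the function names are likewise hypothetical.

```python
import random
from statistics import mean

def normalised_mean_error(sim, obs):
    """Sum of absolute errors divided by the absolute spread of the observations."""
    obs_mean = mean(obs)
    denom = sum(abs(o - obs_mean) for o in obs)
    return sum(abs(s - o) for s, o in zip(sim, obs)) / denom

def mean_model_score(obs):
    """Score of a 'model' that predicts the observed mean everywhere."""
    return normalised_mean_error([mean(obs)] * len(obs), obs)

def random_model_scores(obs, n_boot=1000, seed=0):
    """Scores of 'random' models built by resampling the observations with replacement."""
    rng = random.Random(seed)
    return [
        normalised_mean_error(rng.choices(obs, k=len(obs)), obs)
        for _ in range(n_boot)
    ]

if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]  # hypothetical observations
    print(mean_model_score(obs))
    print(mean(random_model_scores(obs)))
```

With this metric a perfect simulation scores 0 and the mean model scores 1, so a model is only informative for a given benchmark if it scores below both reference values.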

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A truly variance-minimizing filter is introduced and its performance is demonstrated with the Korteweg–DeVries (KdV) equation and with a multilayer quasigeostrophic model of the ocean area around South Africa. It is recalled that Kalman-like filters are not variance minimizing for nonlinear model dynamics and that four-dimensional variational data assimilation (4DVAR)-like methods relying on perfect model dynamics have difficulty with providing error estimates. The new method does not have these drawbacks. In fact, it combines advantages from both methods in that it does provide error estimates while automatically having balanced states after analysis, without extra computations. It is based on ensemble or Monte Carlo integrations to simulate the probability density of the model evolution. When observations are available, the so-called importance resampling algorithm is applied. From Bayes's theorem it follows that each ensemble member receives a new weight dependent on its "distance" to the observations. Because the weights are strongly varying, a resampling of the ensemble is necessary. This resampling is done such that members with high weights are duplicated according to their weights, while low-weight members are largely ignored. In passing, it is noted that data assimilation is not an inverse problem by nature, although it can be formulated that way. Also, it is shown that the posterior variance can be larger than the prior if the usual Gaussian framework is set aside. However, in the examples presented here, the entropy of the probability densities is decreasing. The application to the ocean area around South Africa, governed by strongly nonlinear dynamics, shows that the method is working satisfactorily. The strong and weak points of the method are discussed and possible improvements are proposed.
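The weighting and duplication steps described in this abstract can be sketched for a scalar state. This is a sketch under stated assumptions: the observation error is taken to be Gaussian, and systematic resampling is used as one common way to duplicate members in proportion to their weights. The abstract does not specify the exact scheme, and the function names are illustrative.

```python
import math
import random

def importance_weights(members, observation, obs_std):
    """Normalised weights from a Gaussian likelihood (assumed error model)."""
    log_w = [-0.5 * ((observation - x) / obs_std) ** 2 for x in members]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]

def systematic_resample(members, weights, offset):
    """Duplicate members in proportion to their weights.

    offset is a single uniform draw from [0, 1), e.g. random.random().
    """
    m = len(members)
    resampled = []
    cumulative = weights[0]
    i = 0
    for k in range(m):
        position = (offset + k) / m  # evenly spaced pointers into the weights
        while position > cumulative and i < m - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(members[i])
    return resampled

if __name__ == "__main__":
    ensemble = [0.0, 1.0, 2.0, 3.0]  # hypothetical scalar ensemble
    w = importance_weights(ensemble, observation=2.1, obs_std=0.5)
    print(systematic_resample(ensemble, w, random.random()))
```

Members far from the observation receive weights close to zero and tend to drop out, while members close to it appear several times in the resampled ensemble.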

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Quantitative palaeoclimate reconstructions are widely used to evaluate climate model performance. Here, as part of an effort to provide such a data set for Australia, we examine the impact of analytical decisions and sampling assumptions on modern-analogue reconstructions using a continent-wide pollen data set. There is a high degree of correlation between temperature variables in the modern climate of Australia, but there is sufficient orthogonality in the variations of precipitation, summer and winter temperature and plant-available moisture to allow independent reconstructions of these four variables to be made. The method of analogue selection does not affect the reconstructions, although bootstrap resampling provides a more reliable technique for obtaining robust measures of uncertainty. The number of analogues used affects the quality of the reconstructions: the most robust reconstructions are obtained using 5 analogues. The quality of reconstructions based on post-1850 CE pollen samples differs little from that of reconstructions using samples from between 1450 and 1849 CE, showing that European post-settlement modification of vegetation has no impact on the fidelity of the reconstructions, although it substantially increases the availability of potential analogues. Reconstructions based on core top samples are more realistic than those using surface samples, but using only core top samples would substantially reduce the number of available analogues and therefore increase the uncertainty of the reconstructions. Spatial and/or temporal averaging of pollen assemblages prior to analysis negatively affects the subsequent reconstructions for some variables and increases the associated uncertainties. In addition, the quality of the reconstructions is affected by the degree of spatial smoothing of the original climate data, with the best reconstructions obtained using climate data from a 0.5° resolution grid, which corresponds to the typical size of the pollen catchment.
This study provides a methodology that can be used to provide reliable palaeoclimate reconstructions for Australia, which will fill in a major gap in the data sets used to evaluate climate models.