926 resultados para Bayesian Mixture Model, Cavalieri Method, Trapezoidal Rule
Resumo:
The IntFOLD-TS method was developed according to the guiding principle that the model quality assessment would be the most critical stage for our template based modelling pipeline. Thus, the IntFOLD-TS method firstly generates numerous alternative models, using in-house versions of several different sequence-structure alignment methods, which are then ranked in terms of global quality using our top performing quality assessment method – ModFOLDclust2. In addition to the predicted global quality scores, the predictions of local errors are also provided in the resulting coordinate files, using scores that represent the predicted deviation of each residue in the model from the equivalent residue in the native structure. The IntFOLD-TS method was found to generate high quality 3D models for many of the CASP9 targets, whilst also providing highly accurate predictions of their per-residue errors. This important information may help to make the 3D models that are produced by the IntFOLD-TS method more useful for guiding future experimental work
Resumo:
Pollination is one of the most important ecosystem services in agroecosystems and supports food production. Pollinators are potentially at risk being exposed to pesticides and the main route of exposure is direct contact, in some cases ingestion, of contaminated materials such as pollen, nectar, flowers and foliage. To date there are no suitable methods for predicting pesticide exposure for pollinators, therefore official procedures to assess pesticide risk are based on a Hazard Quotient. Here we develop a procedure to assess exposure and risk for pollinators based on the foraging behaviour of honeybees (Apis mellifera) and using this species as indicator representative of pollinating insects. The method was applied in 13 European field sites with different climatic, landscape and land use characteristics. The level of risk during the crop growing season was evaluated as a function of the active ingredients used and application regime. Risk levels were primarily determined by the agronomic practices employed (i.e. crop type, pest control method, pesticide use), and there was a clear temporal partitioning of risks through time. Generally the risk was higher in sites cultivated with permanent crops, such as vineyard and olive, than in annual crops, such as cereals and oil seed rape. The greatest level of risk is generally found at the beginning of the growing season for annual crops and later in June–July for permanent crops.
Resumo:
The potential for spatial dependence in models of voter turnout, although plausible from a theoretical perspective, has not been adequately addressed in the literature. Using recent advances in Bayesian computation, we formulate and estimate the previously unutilized spatial Durbin error model and apply this model to the question of whether spillovers and unobserved spatial dependence in voter turnout matters from an empirical perspective. Formal Bayesian model comparison techniques are employed to compare the normal linear model, the spatially lagged X model (SLX), the spatial Durbin model, and the spatial Durbin error model. The results overwhelmingly support the spatial Durbin error model as the appropriate empirical model.
Resumo:
Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unseen data. Alternative algorithms have been developed such as the Prism algorithm. Prism constructs modular rules which produce qualitatively better rules than rules induced by TDIDT. However, along with the increasing size of databases, many existing rule learning algorithms have proved to be computational expensive on large datasets. To tackle the problem of scalability, parallel classification rule induction algorithms have been introduced. As TDIDT is the most popular classifier, even though there are strongly competitive alternative algorithms, most parallel approaches to inducing classification rules are based on TDIDT. In this paper we describe work on a distributed classifier that induces classification rules in a parallel manner based on Prism.
Resumo:
We present a model of market participation in which the presence of non-negligible fixed costs leads to random censoring of the traditional double-hurdle model. Fixed costs arise when household resources must be devoted a priori to the decision to participate in the market. These costs, usually of time, are manifested in non-negligible minimum-efficient supplies and supply correspondence that requires modification of the traditional Tobit regression. The costs also complicate econometric estimation of household behavior. These complications are overcome by application of the Gibbs sampler. The algorithm thus derived provides robust estimates of the fixed-costs, double-hurdle model. The model and procedures are demonstrated in an application to milk market participation in the Ethiopian highlands.
Resumo:
We generalize the popular ensemble Kalman filter to an ensemble transform filter, in which the prior distribution can take the form of a Gaussian mixture or a Gaussian kernel density estimator. The design of the filter is based on a continuous formulation of the Bayesian filter analysis step. We call the new filter algorithm the ensemble Gaussian-mixture filter (EGMF). The EGMF is implemented for three simple test problems (Brownian dynamics in one dimension, Langevin dynamics in two dimensions and the three-dimensional Lorenz-63 model). It is demonstrated that the EGMF is capable of tracking systems with non-Gaussian uni- and multimodal ensemble distributions. Copyright © 2011 Royal Meteorological Society
Resumo:
Many applications, such as intermittent data assimilation, lead to a recursive application of Bayesian inference within a Monte Carlo context. Popular data assimilation algorithms include sequential Monte Carlo methods and ensemble Kalman filters (EnKFs). These methods differ in the way Bayesian inference is implemented. Sequential Monte Carlo methods rely on importance sampling combined with a resampling step, while EnKFs utilize a linear transformation of Monte Carlo samples based on the classic Kalman filter. While EnKFs have proven to be quite robust even for small ensemble sizes, they are not consistent since their derivation relies on a linear regression ansatz. In this paper, we propose another transform method, which does not rely on any a priori assumptions on the underlying prior and posterior distributions. The new method is based on solving an optimal transportation problem for discrete random variables. © 2013, Society for Industrial and Applied Mathematics
Resumo:
The multicomponent nonideal gas lattice Boltzmann model by Shan and Chen (S-C) is used to study the immiscible displacement in a sinusoidal tube. The movement of interface and the contact point (contact line in three-dimension) is studied. Due to the roughness of the boundary, the contact point shows "stick-slip" mechanics. The "stick-slip" effect decreases as the speed of the interface increases. For fluids that are nonwetting, the interface is almost perpendicular to the boundaries at most time, although its shapes at different position of the tube are rather different. When the tube becomes narrow, the interface turns a complex curves rather than remains simple menisci. The velocity is found to vary considerably between the neighbor nodes close to the contact point, consistent with the experimental observation that the velocity is multi-values on the contact line. Finally, the effect of three boundary conditions is discussed. The average speed is found different for different boundary conditions. The simple bounce-back rule makes the contact point move fastest. Both the simple bounce-back and the no-slip bounce-back rules are more sensitive to the roughness of the boundary in comparison with the half-way bounce-back rule. The simulation results suggest that the S-C model may be a promising tool in simulating the displacement behaviour of two immiscible fluids in complex geometry.
Resumo:
This paper details a strategy for modifying the source code of a complex model so that the model may be used in a data assimilation context, {and gives the standards for implementing a data assimilation code to use such a model}. The strategy relies on keeping the model separate from any data assimilation code, and coupling the two through the use of Message Passing Interface (MPI) {functionality}. This strategy limits the changes necessary to the model and as such is rapid to program, at the expense of ultimate performance. The implementation technique is applied in different models with state dimension up to $2.7 \times 10^8$. The overheads added by using this implementation strategy in a coupled ocean-atmosphere climate model are shown to be an order of magnitude smaller than the addition of correlated stochastic random errors necessary for some nonlinear data assimilation techniques.
Resumo:
Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalizing constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes’ factors for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating BFs that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.
Resumo:
Land cover data derived from satellites are commonly used to prescribe inputs to models of the land surface. Since such data inevitably contains errors, quantifying how uncertainties in the data affect a model’s output is important. To do so, a spatial distribution of possible land cover values is required to propagate through the model’s simulation. However, at large scales, such as those required for climate models, such spatial modelling can be difficult. Also, computer models often require land cover proportions at sites larger than the original map scale as inputs, and it is the uncertainty in these proportions that this article discusses. This paper describes a Monte Carlo sampling scheme that generates realisations of land cover proportions from the posterior distribution as implied by a Bayesian analysis that combines spatial information in the land cover map and its associated confusion matrix. The technique is computationally simple and has been applied previously to the Land Cover Map 2000 for the region of England and Wales. This article demonstrates the ability of the technique to scale up to large (global) satellite derived land cover maps and reports its application to the GlobCover 2009 data product. The results show that, in general, the GlobCover data possesses only small biases, with the largest belonging to non–vegetated surfaces. In vegetated surfaces, the most prominent area of uncertainty is Southern Africa, which represents a complex heterogeneous landscape. It is also clear from this study that greater resources need to be devoted to the construction of comprehensive confusion matrices.