129 results for Bayesian model selection


Relevance:

30.00%

Abstract:

Motivation: Modelling the 3D structures of proteins can often be enhanced if more than one fold template is used during the modelling process. However, in many cases, this may also result in poorer model quality for a given target or alignment method. There is a need for modelling protocols that can both consistently and significantly improve 3D models and provide an indication of when models might not benefit from the use of multiple target-template alignments. Here, we investigate the use of both global and local model quality prediction scores produced by ModFOLDclust2, to improve the selection of target-template alignments for the construction of multiple-template models. Additionally, we evaluate clustering the resulting population of multi- and single-template models for the improvement of our IntFOLD-TS tertiary structure prediction method. Results: We find that using accurate local model quality scores to guide alignment selection is the most consistent way to significantly improve models for each of the sequence to structure alignment methods tested. In addition, using accurate global model quality for re-ranking alignments, prior to selection, further improves the majority of multi-template modelling methods tested. Furthermore, subsequent clustering of the resulting population of multiple-template models significantly improves the quality of selected models compared with the previous version of our tertiary structure prediction method, IntFOLD-TS.

Relevance:

30.00%

Abstract:

Flood extents caused by fluvial floods in urban and rural areas may be predicted by hydraulic models. Assimilation may be used to correct the model state and improve the estimates of the model parameters or external forcing. One common observation assimilated is the water level at various points along the modelled reach. Distributed water levels may be estimated indirectly along the flood extents in Synthetic Aperture Radar (SAR) images by intersecting the extents with the floodplain topography. It is necessary to select a subset of levels for assimilation because adjacent levels along the flood extent will be strongly correlated. A method for selecting such a subset automatically and in near real-time is described, which would allow the SAR water levels to be used in a forecasting model. The method first selects candidate waterline points in flooded rural areas having low slope. The waterline levels and positions are corrected for the effects of double reflections between the water surface and emergent vegetation at the flood edge. Waterline points are also selected in flooded urban areas away from radar shadow and layover caused by buildings, with levels similar to those in adjacent rural areas. The resulting points are thinned to reduce spatial autocorrelation using a top-down clustering approach. The method was developed using a TerraSAR-X image from a particular case study involving urban and rural flooding. The waterline points extracted proved to be spatially uncorrelated, with levels reasonably similar to those determined manually from aerial photographs, and in good agreement with those of nearby gauges.

Relevance:

30.00%

Abstract:

As in any technology system, analysis and design issues are among the fundamental challenges in persuasive technology. Currently, the Persuasive Systems Development (PSD) framework is considered to be the most comprehensive framework for the design and evaluation of persuasive systems. However, the framework is limited in terms of providing detailed information which can lead to selection of appropriate techniques depending on the variable nature of users or use over time. In light of this, we propose a model which is intended for analysing and implementing behavioural change in persuasive technology called the 3D-RAB model. The 3D-RAB model represents the three-dimensional relationships between attitude towards behaviour, attitude towards change or maintaining a change, and current behaviour, and distinguishes variable levels in a user’s cognitive state. As such it provides a framework which could be used to select appropriate techniques for persuasive technology.

Relevance:

30.00%

Abstract:

Statistical methods of inference typically require the likelihood function to be computable in a reasonable amount of time. The class of “likelihood-free” methods termed Approximate Bayesian Computation (ABC) is able to eliminate this requirement, replacing the evaluation of the likelihood with simulation from it. Likelihood-free methods have gained in efficiency and popularity in the past few years, following their integration with Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) in order to better explore the parameter space. They have been applied primarily to estimating the parameters of a given model, but can also be used to compare models. Here we present novel likelihood-free approaches to model comparison, based upon the independent estimation of the evidence of each model under study. Key advantages of these approaches over previous techniques are that they allow the exploitation of MCMC or SMC algorithms for exploring the parameter space, and that they do not require a sampler able to mix between models. We validate the proposed methods using a simple exponential family problem before providing a realistic problem from human population genetics: the comparison of different demographic models based upon genetic data from the Y chromosome.
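
The idea of estimating each model's evidence independently can be illustrated with a minimal sketch. It uses plain rejection ABC rather than the MCMC or SMC samplers the abstract refers to, and the toy problem is an assumption made here for illustration: 10 Bernoulli trials with 5 observed successes, comparing a model with a uniform prior on the success probability against a model that fixes it at 0.5. The function and variable names are hypothetical. With a tolerance of zero on a discrete statistic, the acceptance rate of prior-predictive simulations estimates the evidence directly.

```python
import random

N_TRIALS = 10  # toy data size (assumption for this sketch)


def abc_evidence(simulate, observed, eps, n_sims, rng):
    """Estimate a model's evidence as the fraction of prior-predictive
    simulations whose statistic lies within eps of the observed one."""
    hits = sum(1 for _ in range(n_sims) if abs(simulate(rng) - observed) <= eps)
    return hits / n_sims


def simulate_uniform_p(rng):
    """Model A (hypothetical): success probability drawn from Uniform(0, 1)."""
    p = rng.random()
    return sum(rng.random() < p for _ in range(N_TRIALS))


def simulate_fair_coin(rng):
    """Model B (hypothetical): success probability fixed at 0.5."""
    return sum(rng.random() < 0.5 for _ in range(N_TRIALS))


rng = random.Random(0)
observed_successes = 5  # invented observation, not data from the paper

# Each evidence is estimated on its own; no sampler has to move between models.
ev_a = abc_evidence(simulate_uniform_p, observed_successes, 0, 20000, rng)
ev_b = abc_evidence(simulate_fair_coin, observed_successes, 0, 20000, rng)
bayes_factor_b_vs_a = ev_b / ev_a

print(f"evidence A ~ {ev_a:.3f}, evidence B ~ {ev_b:.3f}")
print(f"Bayes factor (B vs A) ~ {bayes_factor_b_vs_a:.2f}")
```

For this toy problem the exact evidences are known (1/11 for model A and 252/1024 for model B), so the estimates can be checked against them. In realistic problems the statistic is continuous, the tolerance is positive, and the choice of summary statistics affects the comparison.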

Relevance:

30.00%

Abstract:

Undirected graphical models are widely used in statistics, physics and machine vision. However, Bayesian parameter estimation for undirected models is extremely challenging, since evaluation of the posterior typically involves the calculation of an intractable normalising constant. This problem has received much attention, but very little of this has focussed on the important practical case where the data consists of noisy or incomplete observations of the underlying hidden structure. This paper specifically addresses this problem, comparing two alternative methodologies. In the first of these approaches particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently explore the parameter space, combined with the exchange algorithm (Murray et al., 2006) for avoiding the calculation of the intractable normalising constant (a proof showing that this combination targets the correct distribution is found in a supplementary appendix online). This approach is compared with approximate Bayesian computation (Pritchard et al., 1999). Applications to estimating the parameters of Ising models and exponential random graphs from noisy data are presented. Each algorithm used in the paper targets an approximation to the true posterior due to the use of MCMC to simulate from the latent graphical model, in lieu of being able to do this exactly in general. The supplementary appendix also describes the nature of the resulting approximation.

Relevance:

30.00%

Abstract:

This study examines differences in net selling price for residential real estate across male and female agents. A sample of 2,020 home sales transactions from Fulton County, Georgia is analyzed in a two-stage least squares, geospatial autoregressive corrected, semi-log hedonic model to test for gender and gender selection effects. Although agent gender seems to play a role in naïve models, its role becomes inconclusive as variables controlling for possible price and time on market expectations of the buyers and sellers are introduced to the models. Clear differences in real estate sales prices, time on market, and agent incomes across genders are unlikely to be due to differences in negotiation performance between genders or the mix of genders in a two-agent negotiation. The evidence suggests an interesting alternative to agent performance: that buyers and sellers with different reservation price and time on market expectations, such as those selling foreclosure homes, tend to select agents along gender lines.

Relevance:

30.00%

Abstract:

This letter presents an effective approach for selection of appropriate terrain modeling methods in forming a digital elevation model (DEM). This approach achieves a balance between modeling accuracy and modeling speed. A terrain complexity index is defined to represent a terrain's complexity. A support vector machine (SVM) classifies terrain surfaces into either complex or moderate based on this index associated with the terrain elevation range. The classification result recommends a terrain modeling method for a given data set in accordance with its required modeling accuracy. Sample terrain data from the lunar surface are used in constructing an experimental data set. The results have shown that the terrain complexity index properly reflects the terrain complexity, and the SVM classifier derived from both the terrain complexity index and the terrain elevation range is more effective and generic than that designed from either the terrain complexity index or the terrain elevation range only. The statistical results have shown that the average classification accuracy of SVMs is about 84.3% ± 0.9% for terrain types (complex or moderate). For various ratios of complex and moderate terrain types in a selected data set, the DEM modeling speed increases up to 19.5% with given DEM accuracy.

Relevance:

30.00%

Abstract:

Ensemble-based data assimilation is rapidly proving itself as a computationally-efficient and skilful assimilation method for numerical weather prediction, which can provide a viable alternative to more established variational assimilation techniques. However, a fundamental shortcoming of ensemble techniques is that the resulting analysis increments can only span a limited subspace of the state space, whose dimension is less than the ensemble size. This limits the amount of observational information that can effectively constrain the analysis. In this paper, a data selection strategy that aims to assimilate only the observational components that matter most and that can be used with both stochastic and deterministic ensemble filters is presented. This avoids unnecessary computations, reduces round-off errors and minimizes the risk of importing observation bias in the analysis. When an ensemble-based assimilation technique is used to assimilate high-density observations, the data-selection procedure allows the use of larger localization domains that may lead to a more balanced analysis. Results from the use of this data selection technique with a two-dimensional linear and a nonlinear advection model using both in situ and remote sounding observations are discussed.

Relevance:

30.00%

Abstract:

Approximate Bayesian computation (ABC) methods make use of comparisons between simulated and observed summary statistics to overcome the problem of computationally intractable likelihood functions. As the practical implementation of ABC requires computations based on vectors of summary statistics, rather than full data sets, a central question is how to derive low-dimensional summary statistics from the observed data with minimal loss of information. In this article we provide a comprehensive review and comparison of the performance of the principal methods of dimension reduction proposed in the ABC literature. The methods are split into three nonmutually exclusive classes consisting of best subset selection methods, projection techniques and regularization. In addition, we introduce two new methods of dimension reduction. The first is a best subset selection method based on Akaike and Bayesian information criteria, and the second uses ridge regression as a regularization procedure. We illustrate the performance of these dimension reduction techniques through the analysis of three challenging models and data sets.

Relevance:

30.00%

Abstract:

The use of Bayesian inference for time-frequency representations has, thus far, been limited to offline analysis of signals, using a smoothing-spline-based model of the time-frequency plane. In this paper we introduce a new framework that allows the routine use of Bayesian inference for online estimation of the time-varying spectral density of a locally stationary Gaussian process. The core of our approach is the use of a likelihood inspired by a local Whittle approximation. This choice, along with the use of a recursive algorithm for non-parametric estimation of the local spectral density, permits the use of a particle filter for estimating the time-varying spectral density online. We provide demonstrations of the algorithm through tracking chirps and the analysis of musical data.

Relevance:

30.00%

Abstract:

The character of settlement patterns within the late Mesolithic communities of north-west Europe is a topic of substantial debate. An important case study concerns the five shell middens on the island of Oronsay, Inner Hebrides, western Scotland. Two conflicting interpretations have been proposed: the evidence from seasonality indicators and stable isotope analysis of human bones has been used to support a model of year-round settlement on this small island; alternatively, the middens have been interpreted as resulting from short-term intermittent visits to Oronsay within a regionally mobile settlement pattern. We contribute to this debate by describing Storakaig, a newly discovered site on the nearby island of Islay, undertaking a Bayesian chronological analysis and providing evidence for technological continuity between Oronsay and sites elsewhere in the region. While this new evidence remains open to alternative interpretation, we suggest that it makes regional mobility rather than year-round settlement on Oronsay a more viable interpretation for the Oronsay middens. Our analysis also confirms the likely overlap of the late Mesolithic with the earliest Neolithic within western Scotland.

Relevance:

30.00%

Abstract:

Bayesian analysis is given of an instrumental variable model that allows for heteroscedasticity in both the structural equation and the instrument equation. Specifically, the approach for dealing with heteroscedastic errors in Geweke (1993) is extended to the Bayesian instrumental variable estimator outlined in Rossi et al. (2005). Heteroscedasticity is treated by modelling the variance for each error using a hierarchical prior that is Gamma distributed. The computation is carried out by using a Markov chain Monte Carlo sampling algorithm with an augmented draw for the heteroscedastic case. An example using real data illustrates the approach and shows that ignoring heteroscedasticity in the instrument equation when it exists may lead to biased estimates.

Relevance:

30.00%

Abstract:

We develop an orthogonal forward selection (OFS) approach to construct radial basis function (RBF) network classifiers for two-class problems. Our approach integrates several concepts in probabilistic modelling, including cross validation, mutual information and Bayesian hyperparameter fitting. At each stage of the OFS procedure, one model term is selected by maximising the leave-one-out mutual information (LOOMI) between the classifier’s predicted class labels and the true class labels. We derive the formula of LOOMI within the OFS framework so that the LOOMI can be evaluated efficiently for model term selection. Furthermore, a Bayesian procedure of hyperparameter fitting is also integrated into each stage of the OFS to infer the l2-norm based local regularisation parameter from the data. Since each forward stage is effectively the fitting of a one-variable model, this task is very fast. The classifier construction procedure is automatically terminated without the need for an additional stopping criterion, yielding very sparse RBF classifiers with excellent classification generalisation performance, which is particularly useful for noisy data sets with highly overlapping class distributions. A number of benchmark examples are employed to demonstrate the effectiveness of our proposed approach.

Relevance:

30.00%

Abstract:

We consider the forecasting of macroeconomic variables that are subject to revisions, using Bayesian vintage-based vector autoregressions. The prior incorporates the belief that, after the first few data releases, subsequent ones are likely to consist of revisions that are largely unpredictable. The Bayesian approach allows the joint modelling of the data revisions of more than one variable, while keeping the concomitant increase in parameter estimation uncertainty manageable. Our model provides markedly more accurate forecasts of post-revision values of inflation than do other models in the literature.

Relevance:

30.00%

Abstract:

Various studies have indicated a relationship between enteric methane (CH4) production and milk fatty acid (FA) profiles of dairy cattle. However, the number of studies investigating such a relationship is limited and the direct relationships reported are mainly obtained by variation in CH4 production and milk FA concentration induced by dietary lipid supplements. The aim of this study was to perform a meta-analysis to quantify relationships between CH4 yield (per unit of feed and unit of milk) and milk FA profile in dairy cattle and to develop equations to predict CH4 yield based on milk FA profile of cows fed a wide variety of diets. Data from 8 experiments encompassing 30 different dietary treatments and 146 observations were included. Yield of CH4 measured in these experiments was 21.5 ± 2.46 g/kg of dry matter intake (DMI) and 13.9 ± 2.30 g/kg of fat- and protein-corrected milk (FPCM). Correlation coefficients were chosen as effect size of the relationship between CH4 yield and individual milk FA concentration (g/100 g of FA). Average true correlation coefficients were estimated by a random-effects model. Milk FA concentrations of C6:0, C8:0, C10:0, C16:0, and C16:0-iso were significantly or tended to be positively related to CH4 yield per unit of feed. Concentrations of trans-6+7+8+9 C18:1, trans-10+11 C18:1, cis-11 C18:1, cis-12 C18:1, cis-13 C18:1, trans-16+cis-14 C18:1, and cis-9,12 C18:2 in milk fat were significantly or tended to be negatively related to CH4 yield per unit of feed. Milk FA concentrations of C10:0, C12:0, C14:0-iso, C14:0, cis-9 C14:1, C15:0, and C16:0 were significantly or tended to be positively related to CH4 yield per unit of milk. Concentrations of C4:0, C18:0, trans-10+11 C18:1, cis-9 C18:1, cis-11 C18:1, and cis-9,12 C18:2 in milk fat were significantly or tended to be negatively related to CH4 yield per unit of milk.
Mixed model multiple regression and a stepwise selection procedure of milk FA based on the Bayesian information criterion to predict CH4 yield with milk FA as input (g/100 g of FA) resulted in the following prediction equations: CH4 (g/kg of DMI) = 23.39 + 9.74 × C16:0-iso – 1.06 × trans-10+11 C18:1 – 1.75 × cis-9,12 C18:2 (R² = 0.54), and CH4 (g/kg of FPCM) = 21.13 – 1.38 × C4:0 + 8.53 × C16:0-iso – 0.22 × cis-9 C18:1 – 0.59 × trans-10+11 C18:1 (R² = 0.47). This indicated that milk FA profile has a moderate potential for predicting CH4 yield per unit of feed and a slightly lower potential for predicting CH4 yield per unit of milk. Key words: methane, milk fatty acid profile, meta-analysis, dairy cattle
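
The two prediction equations quoted in this abstract can be written out as a short sketch. The coefficients are taken from the abstract; the function names, argument names, and the input values in the usage lines are hypothetical and chosen only to show the arithmetic. All inputs are milk FA concentrations in g/100 g of FA.

```python
def ch4_per_kg_dmi(c16_0_iso, trans_10_11_c18_1, cis_9_12_c18_2):
    """CH4 yield in g/kg of dry matter intake (equation with R² = 0.54)."""
    return (23.39
            + 9.74 * c16_0_iso
            - 1.06 * trans_10_11_c18_1
            - 1.75 * cis_9_12_c18_2)


def ch4_per_kg_fpcm(c4_0, c16_0_iso, cis_9_c18_1, trans_10_11_c18_1):
    """CH4 yield in g/kg of fat- and protein-corrected milk (R² = 0.47)."""
    return (21.13
            - 1.38 * c4_0
            + 8.53 * c16_0_iso
            - 0.22 * cis_9_c18_1
            - 0.59 * trans_10_11_c18_1)


# Invented example concentrations, not values reported in the study.
print(round(ch4_per_kg_dmi(0.2, 1.0, 1.5), 3))
print(round(ch4_per_kg_fpcm(3.5, 0.2, 20.0, 1.0), 3))
```

Given the reported R² values, outputs of these functions are rough estimates rather than precise predictions.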