28 results for Probabilistic graphical model
in CentAUR: Central Archive University of Reading - UK
Abstract:
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.
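The Metropolis-Hastings step named in this abstract can be illustrated with a minimal random-walk sampler. This is only a sketch: the one-dimensional target, the Gaussian proposal, and the names `metropolis_hastings`, `log_target` and `step` are assumptions made for illustration, not the tree and hidden Markov model sampler the article describes.

```python
import math
import random


def metropolis_hastings(log_target, initial, step, n_samples, rng):
    """Random-walk Metropolis-Hastings for a 1-D target (illustrative only)."""
    samples = []
    current = initial
    current_lp = log_target(current)
    for _ in range(n_samples):
        # Symmetric Gaussian proposal, so the Hastings ratio reduces
        # to the ratio of target densities.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        # A rejected proposal repeats the current state in the chain.
        samples.append(current)
    return samples


def log_standard_normal(x):
    """Unnormalised log density of a standard normal (assumed toy target)."""
    return -0.5 * x * x
```

Only the unnormalised log density is needed, which is why this kind of sampler suits posteriors whose normalising constant is unknown.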
Abstract:
Undirected graphical models are widely used in statistics, physics and machine vision. However, Bayesian parameter estimation for undirected models is extremely challenging, since evaluation of the posterior typically involves the calculation of an intractable normalising constant. This problem has received much attention, but very little of this has focussed on the important practical case where the data consists of noisy or incomplete observations of the underlying hidden structure. This paper specifically addresses this problem, comparing two alternative methodologies. In the first of these approaches particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently explore the parameter space, combined with the exchange algorithm (Murray et al., 2006) for avoiding the calculation of the intractable normalising constant (a proof showing that this combination targets the correct distribution is found in a supplementary appendix online). This approach is compared with approximate Bayesian computation (Pritchard et al., 1999). Applications to estimating the parameters of Ising models and exponential random graphs from noisy data are presented. Each algorithm used in the paper targets an approximation to the true posterior due to the use of MCMC to simulate from the latent graphical model, in lieu of being able to do this exactly in general. The supplementary appendix also describes the nature of the resulting approximation.
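The approximate Bayesian computation approach mentioned in this abstract can be sketched in its simplest rejection form: draw a parameter from the prior, simulate data, and keep the draw when a summary of the simulation lies close to the observed summary. The Bernoulli toy model, the uniform prior, the tolerance and every function name below are assumptions chosen for brevity. They stand in for the Ising and random graph models of the paper, which would need MCMC to simulate.

```python
import random


def abc_rejection(observed_summary, prior_sample, simulate_summary,
                  distance, epsilon, n_draws, rng):
    """Keep prior draws whose simulated summary is within epsilon of the data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        simulated = simulate_summary(theta, rng)
        if distance(simulated, observed_summary) <= epsilon:
            accepted.append(theta)
    return accepted


# Toy stand-in model (hypothetical): 50 Bernoulli trials, unknown success rate.
def prior_sample(rng):
    return rng.random()  # uniform prior on [0, 1]


def simulate_summary(theta, rng, n_trials=50):
    return sum(rng.random() < theta for _ in range(n_trials))


def distance(a, b):
    return abs(a - b)


rng = random.Random(0)
posterior_draws = abc_rejection(35, prior_sample, simulate_summary,
                                distance, epsilon=2, n_draws=5000, rng=rng)
```

No likelihood is ever evaluated, so the intractable normalising constant never appears. The price is that the accepted draws target an approximation whose quality depends on the summary statistic and on `epsilon`.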
Abstract:
What is the relationship between magnitude judgments relying on directly available characteristics versus probabilistic cues? Question frame was manipulated in a comparative judgment task previously assumed to involve inference across a probabilistic mental model (e.g., “which city is largest” – the “larger” question – versus “which city is smallest” – the “smaller” question). Participants identified either the largest or smallest city (Experiments 1a, 2) or the richest or poorest person (Experiment 1b) in a three-alternative forced choice (3-AFC) task (Experiment 1) or 2-AFC task (Experiment 2). Response times revealed an interaction between question frame and the number of options recognized. When asked the smaller question, response times were shorter when none of the options were recognized. The opposite pattern was found when asked the larger question: response time was shorter when all options were recognized. These task-stimuli congruity results in judgment under uncertainty are consistent with, and predicted by, theories of magnitude comparison which make use of deductive inferences from declarative knowledge.
Abstract:
This volume is a serious attempt to open up the subject of European philosophy of science to real thought, and provide the structural basis for the interdisciplinary development of its specialist fields, but also to provoke reflection on the idea of ‘European philosophy of science’. This effort should foster a contemporaneous reflection on what might be meant by philosophy of science in Europe and European philosophy of science, and how in fact awareness of it could assist philosophers to interpret and motivate their research through a stronger collective identity. The overarching aim is to set the background for a collaborative project organising, systematising, and ultimately forging an identity for, European philosophy of science by creating research structures and developing research networks across Europe to promote its development.
Abstract:
Climate model ensembles are widely heralded for their potential to quantify uncertainties and generate probabilistic climate projections. However, such technical improvements to modeling science will do little to deliver on their ultimate promise of improving climate policymaking and adaptation unless the insights they generate can be effectively communicated to decision makers. While some of these communicative challenges are unique to climate ensembles, others are common to hydrometeorological modeling more generally, and to the tensions arising between the imperatives for saliency, robustness, and richness in risk communication. The paper reviews emerging approaches to visualizing and communicating climate ensembles and compares them to the more established and thoroughly evaluated communication methods used in the numerical weather prediction domains of day-to-day weather forecasting (in particular probabilities of precipitation), hurricane and flood warning, and seasonal forecasting. This comparative analysis informs recommendations on best practice for climate modelers, as well as prompting some further thoughts on key research challenges to improve the future communication of climate change uncertainties.
Abstract:
We present a new parameterisation that relates surface mass balance (SMB: the sum of surface accumulation and surface ablation) to changes in surface elevation of the Greenland ice sheet (GrIS) for the MAR (Modèle Atmosphérique Régional: Fettweis, 2007) regional climate model. The motivation is to dynamically adjust SMB as the GrIS evolves, allowing us to force ice sheet models with SMB simulated by MAR while incorporating the SMB–elevation feedback, without the substantial technical challenges of coupling ice sheet and climate models. This also allows us to assess the effect of elevation feedback uncertainty on the GrIS contribution to sea level, using multiple global climate and ice sheet models, without the need for additional, expensive MAR simulations. We estimate this relationship separately below and above the equilibrium line altitude (ELA, separating negative and positive SMB) and for regions north and south of 77° N, from a set of MAR simulations in which we alter the ice sheet surface elevation. These give four “SMB lapse rates”, gradients that relate SMB changes to elevation changes. We assess uncertainties within a Bayesian framework, estimating probability distributions for each gradient from which we present best estimates and credibility intervals (CI) that bound 95% of the probability. Below the ELA our gradient estimates are mostly positive, because SMB usually increases with elevation: 0.56 (95% CI: −0.22 to 1.33) kg m−3 a−1 for the north, and 1.91 (1.03 to 2.61) kg m−3 a−1 for the south. Above the ELA, the gradients are much smaller in magnitude: 0.09 (−0.03 to 0.23) kg m−3 a−1 in the north, and 0.07 (−0.07 to 0.59) kg m−3 a−1 in the south, because SMB can either increase or decrease in response to increased elevation. Our statistically founded approach allows us to make probabilistic assessments for the effect of elevation feedback uncertainty on sea level projections (Edwards et al., 2014).
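The four SMB lapse rates quoted in this abstract suggest a simple linear correction, sketched below with the best-estimate gradients only. The function name `adjust_smb`, the use of the sign of the reference SMB to decide whether a point lies below or above the ELA, and the treatment of the exact 77° N and zero-SMB boundaries are assumptions, not details taken from the paper.

```python
# Best-estimate SMB lapse rates from the abstract, in kg m^-3 a^-1.
# Keys are (region, zone); zone is relative to the equilibrium line altitude.
SMB_GRADIENTS = {
    ("north", "below"): 0.56,
    ("south", "below"): 1.91,
    ("north", "above"): 0.09,
    ("south", "above"): 0.07,
}


def adjust_smb(smb, elevation_change_m, latitude_deg_n):
    """Linearly adjust SMB (kg m^-2 a^-1) for a surface elevation change (m).

    Assumption: negative reference SMB marks a point below the ELA,
    and latitudes of 77 degrees N or more count as the northern region.
    """
    region = "north" if latitude_deg_n >= 77.0 else "south"
    zone = "below" if smb < 0.0 else "above"
    gradient = SMB_GRADIENTS[(region, zone)]
    return smb + gradient * elevation_change_m
```

For example, a southern ablation-zone point with an SMB of −1000 kg m−2 a−1 that lowers by 10 m would be adjusted by 1.91 × (−10) = −19.1 kg m−2 a−1. Sampling each gradient from its probability distribution instead of using the best estimate would carry the reported uncertainty through to the result.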
Abstract:
An ability to quantify the reliability of probabilistic flood inundation predictions is a requirement not only for guiding model development but also for their successful application. Probabilistic flood inundation predictions are usually produced by choosing a method of weighting the model parameter space, but a previous study suggests that this choice leads to clear differences in inundation probabilities. This study aims to address the evaluation of the reliability of these probabilistic predictions. However, a lack of an adequate number of observations of flood inundation for a catchment limits the application of conventional methods of evaluating predictive reliability. Consequently, attempts have been made to assess the reliability of probabilistic predictions using multiple observations from a single flood event. Here, a LISFLOOD-FP hydraulic model of an extreme (>1 in 1000 years) flood event in Cockermouth, UK, is constructed and calibrated using multiple performance measures from both peak flood wrack mark data and aerial photography captured post-peak. These measures are used in weighting the parameter space to produce multiple probabilistic predictions for the event. Two methods of assessing the reliability of these probabilistic predictions using limited observations are utilized: an existing method assessing the binary pattern of flooding, and a method developed in this paper to assess predictions of water surface elevation. This study finds that the water surface elevation method has both a better diagnostic and discriminatory ability, but this result is likely to be sensitive to the unknown uncertainties in the upstream boundary condition.
Abstract:
A new dynamic model of water quality, Q(2), has recently been developed, capable of simulating large branched river systems. This paper describes the application of a generalized sensitivity analysis (GSA) to Q(2) for single reaches of the River Thames in southern England. Focusing on the simulation of dissolved oxygen (DO) (since this may be regarded as a proxy for the overall health of a river), the GSA is used to identify key parameters controlling model behavior and provide a probabilistic procedure for model calibration. It is shown that, in the River Thames at least, it is more important to obtain high quality forcing functions than to obtain improved parameter estimates once approximate values have been estimated. Furthermore, there is a need to ensure reasonable simulation of a range of water quality determinands, since a focus only on DO increases predictive uncertainty in the DO simulations. The Q(2) model has been applied here to the River Thames, but it has a broad utility for evaluating other systems in Europe and around the world.
Abstract:
Process-based integrated modelling of weather and crop yield over large areas is becoming an important research topic. The production of the DEMETER ensemble hindcasts of weather allows this work to be carried out in a probabilistic framework. In this study, ensembles of crop yield (groundnut, Arachis hypogaea L.) were produced for ten 2.5° × 2.5° grid cells in western India using the DEMETER ensembles and the general large-area model (GLAM) for annual crops. Four key issues are addressed by this study. First, crop model calibration methods for use with weather ensemble data are assessed. Calibration using yield ensembles was more successful than calibration using reanalysis data (the European Centre for Medium-Range Weather Forecasts 40-yr reanalysis, ERA40). Secondly, the potential for probabilistic forecasting of crop failure is examined. The hindcasts show skill in the prediction of crop failure, with more severe failures being more predictable. Thirdly, the use of yield ensemble means to predict interannual variability in crop yield is examined and their skill assessed relative to baseline simulations using ERA40. The accuracy of multi-model yield ensemble means is equal to or greater than the accuracy using ERA40. Fourthly, the impact of two key uncertainties, sowing window and spatial scale, is briefly examined. The impact of uncertainty in the sowing window is greater with ERA40 than with the multi-model yield ensemble mean. Subgrid heterogeneity affects model accuracy: where correlations are low on the grid scale, they may be significantly positive on the subgrid scale. The implications of the results of this study for yield forecasting on seasonal time-scales are as follows. (i) There is the potential for probabilistic forecasting of crop failure (defined by a threshold yield value); forecasting of yield terciles shows less potential. (ii) Any improvement in the skill of climate models has the potential to translate into improved deterministic yield prediction. (iii) Whilst model input uncertainties are important, uncertainty in the sowing window may not require specific modelling. The implications of the results of this study for yield forecasting on multidecadal (climate change) time-scales are as follows. (i) The skill in the ensemble mean suggests that the perturbation, within uncertainty bounds, of crop and climate parameters, could potentially average out some of the errors associated with mean yield prediction. (ii) For a given technology trend, decadal fluctuations in the yield-gap parameter used by GLAM may be relatively small, implying some predictability on those time-scales.
Abstract:
Graphical tracking is a technique for crop scheduling where the actual plant state is plotted against an ideal target curve which encapsulates all crop and environmental characteristics. Management decisions are made on the basis of the position of the actual crop against the ideal position. Due to the simplicity of the approach it is possible for graphical tracks to be developed on site without the requirement for controlled experimentation. Growth models and graphical tracks are discussed, and an implementation of the Richards curve for graphical tracking described. In many cases, the more intuitively desirable growth models perform sub-optimally due to problems with the specification of starting conditions, environmental factors outside the scope of the original model and the introduction of new cultivars. Accurate specification for a biological model requires detailed and usually costly study, and as such is not adaptable to a changing cultivar range and changing cultivation techniques. Fitting of a new graphical track for a new cultivar can be conducted on site and improved over subsequent seasons. Graphical tracking emphasises the current position relative to the objective, and as such does not require the time consuming or system specific input of an environmental history, although it does require detailed crop measurement. The approach is flexible and could be applied to a variety of specification metrics, with digital imaging providing a route for added value. For decision making regarding crop manipulation from the observed current state, there is a role for simple predictive modelling over the short term to indicate the short term consequences of crop manipulation.
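The idea of plotting the actual plant state against a Richards target curve can be sketched as follows. The abstract does not give the parameterisation used, so the generalised logistic form below, together with the names `richards`, `track_deviation` and all parameter values in the usage note, are assumptions for illustration only.

```python
import math


def richards(t, upper, growth_rate, t_mid, nu=1.0, q=1.0, lower=0.0):
    """Richards (generalised logistic) curve, one common parameterisation.

    With nu = 1 and q = 1 this reduces to the ordinary logistic curve,
    which passes through the midpoint of lower and upper at t = t_mid.
    """
    denominator = (1.0 + q * math.exp(-growth_rate * (t - t_mid))) ** (1.0 / nu)
    return lower + (upper - lower) / denominator


def track_deviation(t, observed, **curve_params):
    """Observed plant state minus the target curve at time t.

    A negative value means the crop is behind the target track.
    """
    return observed - richards(t, **curve_params)
```

With an assumed target of `upper=40.0`, `growth_rate=0.2` and `t_mid=30.0`, a plant measured at 18.0 on day 30 gives a deviation of −2.0, which a grower could read as the crop being behind schedule.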
Abstract:
In this paper, we evaluate the Probabilistic Occupancy Map (POM) pedestrian detection algorithm on the PETS 2009 benchmark dataset. POM is a multi-camera generative detection method, which estimates ground plane occupancy from multiple background subtraction views. Occupancy probabilities are iteratively estimated by fitting a synthetic model of the background subtraction to the binary foreground motion. Furthermore, we test the integration of this algorithm into a larger framework designed for understanding human activities in real environments. We demonstrate accurate detection and localization on the PETS dataset, despite suboptimal calibration and foreground motion segmentation input.
Abstract:
We introduce a classification-based approach to finding occluding texture boundaries. The classifier is composed of a set of weak learners, which operate on image intensity discriminative features that are defined on small patches and are fast to compute. A database that is designed to simulate digitized occluding contours of textured objects in natural images is used to train the weak learners. The trained classifier score is then used to obtain a probabilistic model for the presence of texture transitions, which can readily be used for line search texture boundary detection in the direction normal to an initial boundary estimate. This method is fast and therefore suitable for real-time and interactive applications. It works as a robust estimator, which requires a ribbon-like search region and can handle complex texture structures without requiring a large number of observations. We demonstrate results both in the context of interactive 2D delineation and of fast 3D tracking and compare its performance with other existing methods for line search boundary detection.
Abstract:
A new probabilistic neural network (PNN) learning algorithm based on forward constrained selection (PNN-FCS) is proposed. An incremental learning scheme is adopted such that at each step, new neurons, one for each class, are selected from the training samples and the weights of the neurons are estimated so as to minimize the overall misclassification error rate. In this manner, only the most significant training samples are used as the neurons. It is shown by simulation that the resultant networks of PNN-FCS have good classification performance compared to other types of classifiers, but much smaller model sizes than conventional PNN.
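For context, the conventional PNN that this abstract uses as its baseline can be sketched as a Parzen-window classifier in which every training sample acts as a neuron. The sketch below does not implement the forward constrained selection step or the weight estimation of PNN-FCS. The function name `pnn_predict`, the Gaussian kernel and the default `sigma` are assumptions.

```python
import math


def pnn_predict(x, train_points, train_labels, sigma=1.0):
    """Classify x with a conventional PNN: one Gaussian kernel per sample."""
    kernel_sums = {}
    counts = {}
    for point, label in zip(train_points, train_labels):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, point))
        kernel = math.exp(-sq_dist / (2.0 * sigma ** 2))
        kernel_sums[label] = kernel_sums.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    # The class score is the average kernel response over that class's samples.
    scores = {c: kernel_sums[c] / counts[c] for c in kernel_sums}
    return max(scores, key=scores.get)
```

Because every training sample is kept, the model grows with the data set, which is the size problem that selecting only the most significant samples is meant to address.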
Abstract:
Given a nonlinear model, a probabilistic forecast may be obtained by Monte Carlo simulations. At a given forecast horizon, Monte Carlo simulations yield sets of discrete forecasts, which can be converted to density forecasts. The resulting density forecasts will inevitably be downgraded by model mis-specification. In order to enhance the quality of the density forecasts, one can mix them with the unconditional density. This paper examines the value of combining conditional density forecasts with the unconditional density. The findings have positive implications for issuing early warnings in different disciplines including economics and meteorology, but UK inflation forecasts are considered as an example.
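Mixing a conditional density forecast with the unconditional density can be written as a weighted sum of the two densities. The sketch below assumes Gaussian densities and a fixed blending weight purely for illustration. The names `gaussian_pdf` and `blend_density` are hypothetical, and the paper's own way of choosing the weight is not reproduced here.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def blend_density(x, cond_pdf, uncond_pdf, weight):
    """Mixture density: weight on the conditional forecast, the rest on the
    unconditional (climatological) density."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * cond_pdf(x) + (1.0 - weight) * uncond_pdf(x)


# Assumed toy numbers: an overconfident model forecast and a broad
# unconditional density.
def cond(x):
    return gaussian_pdf(x, mean=2.0, std=0.2)


def uncond(x):
    return gaussian_pdf(x, mean=2.5, std=1.0)
```

When the outcome falls far from the conditional forecast, as at `x = 3.5` with these toy numbers, the blended density assigns it far more probability than the conditional forecast alone. This is one way to see how blending can protect against model mis-specification.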