987 results for Bayesian Modelling
Abstract:
We consider the forecasting of macroeconomic variables that are subject to revisions, using Bayesian vintage-based vector autoregressions. The prior incorporates the belief that, after the first few data releases, subsequent ones are likely to consist of revisions that are largely unpredictable. The Bayesian approach allows the joint modelling of the data revisions of more than one variable, while keeping the concomitant increase in parameter estimation uncertainty manageable. Our model provides markedly more accurate forecasts of post-revision values of inflation than do other models in the literature.
Abstract:
This paper investigates the feasibility of using approximate Bayesian computation (ABC) to calibrate and evaluate complex individual-based models (IBMs). As ABC evolves, various versions are emerging, but here we only explore the most accessible version, rejection-ABC. Rejection-ABC involves running models a large number of times, with parameters drawn randomly from their prior distributions, and then retaining the simulations closest to the observations. Although well-established in some fields, whether ABC will work with ecological IBMs is still uncertain. Rejection-ABC was applied to an existing 14-parameter earthworm energy budget IBM for which the available data consist of body mass growth and cocoon production in four experiments. ABC was able to narrow the posterior distributions of seven parameters, estimating credible intervals for each. ABC’s accepted values produced slightly better fits than literature values do. The accuracy of the analysis was assessed using cross-validation and coverage, currently the best available tests. Of the seven unnarrowed parameters, ABC revealed that three were correlated with other parameters, while the remaining four were found to be not estimable given the data available. It is often desirable to compare models to see whether all component modules are necessary. Here we used ABC model selection to compare the full model with a simplified version which removed the earthworm’s movement and much of the energy budget. We are able to show that inclusion of the energy budget is necessary for a good fit to the data. We show how our methodology can inform future modelling cycles, and briefly discuss how more advanced versions of ABC may be applicable to IBMs. 
We conclude that ABC has the potential to represent uncertainty in model structure, parameters and predictions, and to embed the often complex process of optimizing an IBM’s structure and parameters within an established statistical framework, thereby making the process more transparent and objective.
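The rejection-ABC procedure described in this abstract (draw parameters from the prior, run the model, keep the runs closest to the observations) can be sketched as follows. This is only an illustration under stated assumptions: the one-parameter `simulate` function, the uniform prior on [-5, 5], the sample mean as summary statistic and the choice of keeping the 100 closest runs are all hypothetical stand-ins, not the earthworm IBM or the settings used in the paper.

```python
import random


def simulate(theta, n, rng):
    """Toy stand-in for one model run: n noisy observations centred on theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(values):
    """Summary statistic used to compare a simulation with the data (assumed: the mean)."""
    return sum(values) / len(values)


def rejection_abc(observed, n_sims=5000, n_keep=100, seed=0):
    """Draw parameters from the prior, simulate, and keep the n_keep closest runs."""
    rng = random.Random(seed)
    obs_summary = summary(observed)
    runs = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # hypothetical uniform prior
        sim_summary = summary(simulate(theta, len(observed), rng))
        runs.append((abs(sim_summary - obs_summary), theta))
    runs.sort(key=lambda pair: pair[0])  # smallest distance first
    return [theta for _, theta in runs[:n_keep]]


# Synthetic "observations" generated with a known parameter value of 2.0.
observed = simulate(2.0, 50, random.Random(42))
accepted = rejection_abc(observed)
posterior_mean = sum(accepted) / len(accepted)
```

The accepted values approximate a sample from the posterior; their spread gives a rough credible interval, which is how a narrowed posterior would show up in practice.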
Abstract:
Land cover data derived from satellites are commonly used to prescribe inputs to models of the land surface. Since such data inevitably contains errors, quantifying how uncertainties in the data affect a model’s output is important. To do so, a spatial distribution of possible land cover values is required to propagate through the model’s simulation. However, at large scales, such as those required for climate models, such spatial modelling can be difficult. Also, computer models often require land cover proportions at sites larger than the original map scale as inputs, and it is the uncertainty in these proportions that this article discusses. This paper describes a Monte Carlo sampling scheme that generates realisations of land cover proportions from the posterior distribution as implied by a Bayesian analysis that combines spatial information in the land cover map and its associated confusion matrix. The technique is computationally simple and has been applied previously to the Land Cover Map 2000 for the region of England and Wales. This article demonstrates the ability of the technique to scale up to large (global) satellite derived land cover maps and reports its application to the GlobCover 2009 data product. The results show that, in general, the GlobCover data possesses only small biases, with the largest belonging to non–vegetated surfaces. In vegetated surfaces, the most prominent area of uncertainty is Southern Africa, which represents a complex heterogeneous landscape. It is also clear from this study that greater resources need to be devoted to the construction of comprehensive confusion matrices.
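One way to picture a Monte Carlo scheme of this kind is sketched below. It is an assumption-laden illustration, not the authors' algorithm: it assumes each row of the confusion matrix (mapped class by reference class) gets an independent Dirichlet posterior with a uniform prior, and that a site's true proportions are the mapped proportions reweighted by the sampled rows. The matrix counts and mapped shares are invented for the example.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalising independent gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_site_proportions(confusion, mapped_share, n_draws=1000, seed=0):
    """Generate realisations of true land cover proportions for one site.

    confusion[i][j]: validation pixels mapped as class i whose reference class is j.
    mapped_share[i]: fraction of the site that the map assigns to class i.
    """
    rng = random.Random(seed)
    n_classes = len(confusion)
    realisations = []
    for _ in range(n_draws):
        true_share = [0.0] * n_classes
        for row, weight in zip(confusion, mapped_share):
            # Assumed posterior for P(true class | mapped class): Dirichlet(counts + 1).
            p_true = sample_dirichlet([count + 1.0 for count in row], rng)
            for j, p in enumerate(p_true):
                true_share[j] += weight * p
        realisations.append(true_share)
    return realisations


# Hypothetical three-class confusion matrix and mapped proportions for one site.
confusion = [[80, 15, 5], [10, 85, 5], [2, 8, 90]]
mapped_share = [0.5, 0.3, 0.2]
realisations = sample_site_proportions(confusion, mapped_share)
mean_class0 = sum(r[0] for r in realisations) / len(realisations)
```

Each realisation is a full set of proportions summing to one, so the collection can be fed through a land surface model to propagate the map uncertainty.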
Abstract:
In this paper, we compare the performance of two statistical approaches for the analysis of data obtained from the social research area. In the first approach, we use normal models with joint regression modelling for the mean and for the variance heterogeneity. In the second approach, we use hierarchical models. In the first case, individual and social variables are included in the regression modelling for the mean and for the variance, as explanatory variables, while in the second case, the variance at level 1 of the hierarchical model depends on the individuals (age of the individuals), and at level 2 of the hierarchical model, the variance is assumed to change according to socioeconomic stratum. Applying these methodologies, we analyze a Colombian height data set to find differences that can be explained by socioeconomic conditions. We also present some theoretical and empirical results concerning the two models. From this comparative study, we conclude that it is better to jointly model the mean and variance heterogeneity in all cases. We also observe that the convergence of the Gibbs sampling chain used in the Markov Chain Monte Carlo method for jointly modelling the mean and variance heterogeneity is quickly achieved.
Abstract:
Linear mixed effects models are frequently used to analyse longitudinal data, due to their flexibility in modelling the covariance structure between and within observations. Further, it is easy to deal with unbalanced data, either with respect to the number of observations per subject or per time period, and with varying time intervals between observations. In most applications of mixed models to biological sciences, a normal distribution is assumed both for the random effects and for the residuals. This, however, makes inferences vulnerable to the presence of outliers. Here, linear mixed models employing thick-tailed distributions for robust inferences in longitudinal data analysis are described. Specific distributions discussed include the Student-t, the slash and the contaminated normal. A Bayesian framework is adopted, and the Gibbs sampler and the Metropolis-Hastings algorithms are used to carry out the posterior analyses. An example with data on orthodontic distance growth in children is discussed to illustrate the methodology. Analyses based on either the Student-t distribution or on the usual Gaussian assumption are contrasted. The thick-tailed distributions provide an appealing robust alternative to the Gaussian process for modelling distributions of the random effects and of residuals in linear mixed models, and the MCMC implementation allows the computations to be performed in a flexible manner.
Abstract:
In the context of Bayesian statistical analysis, elicitation is the process of formulating a prior density f(.) about one or more uncertain quantities to represent a person's knowledge and beliefs. Several different methods of eliciting prior distributions for one unknown parameter have been proposed. However, there are relatively few methods for specifying a multivariate prior distribution, and most are applicable only to specific classes of problems and/or based on restrictive conditions, such as independence of variables. Besides, many of these procedures require the elicitation of variances and correlations, and sometimes elicitation of hyperparameters which are difficult for experts to specify in practice. Garthwaite et al. (2005) discuss the different methods proposed in the literature and the difficulties of eliciting multivariate prior distributions. We describe a flexible method of eliciting multivariate prior distributions applicable to a wide class of practical problems. Our approach does not assume a parametric form for the unknown prior density f(.); instead, we use nonparametric Bayesian inference, modelling f(.) by a Gaussian process prior distribution. The expert is then asked to specify certain summaries of his/her distribution, such as the mean, mode, marginal quantiles and a small number of joint probabilities. The analyst receives that information, treating it as a data set D with which to update his/her prior beliefs to obtain the posterior distribution for f(.). Theoretical properties of joint and marginal priors are derived and numerical illustrations to demonstrate our approach are given. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
The disturbance vicariance hypothesis (DV) has been proposed to explain speciation in Amazonia, especially its edge regions, e.g. in eastern Guiana Shield harlequin frogs (Atelopus), which are suggested to have derived from a cool-adapted Andean ancestor. In concordance with DV predictions, we examined whether (i) these amphibians display a natural distribution gap in central Amazonia; (ii) east of this gap they constitute a monophyletic lineage which is nested within a pre-Andean/western clade; (iii) climate envelopes of Atelopus west and east of the distribution gap show some macroclimatic divergence due to a regional climate envelope shift; (iv) geographic distributions of climate envelopes of western and eastern Atelopus range into central Amazonia but with limited spatial overlap. We used Ripley's K function to test whether presence and apparent absence data points of Atelopus were homogeneously distributed. A molecular phylogeny (mitochondrial 16S rRNA gene) was reconstructed using Maximum Likelihood and Bayesian Inference to test whether Guianan Atelopus constitute a clade nested within a larger genus phylogeny. We focused on climate envelope divergence and geographic distribution by computing climatic envelope models with MaxEnt based on macroscale bioclimatic parameters and testing them by using Schoener's index and modified Hellinger distance. We corroborated existing DV predictions and, for the first time, formulated new DV predictions aimed at species' climate envelope change. Our results suggest that cool-adapted Andean Atelopus ancestors had dispersed into the Amazon basin and further onto the eastern Guiana Shield where, under warm conditions, they were forced to change climate envelopes. © 2010 The Author(s).
Abstract:
This thesis presents Bayesian solutions to inference problems for three types of social network data structures: a single observation of a social network, repeated observations on the same social network, and repeated observations on a social network developing through time. A social network is conceived as being a structure consisting of actors and their social interaction with each other. A common conceptualisation of social networks is to let the actors be represented by nodes in a graph with edges between pairs of nodes that are relationally tied to each other according to some definition. Statistical analysis of social networks is to a large extent concerned with modelling of these relational ties, which lends itself to empirical evaluation. The first paper deals with a family of statistical models for social networks called exponential random graphs that takes various structural features of the network into account. In general, the likelihood functions of exponential random graphs are only known up to a constant of proportionality. A procedure for performing Bayesian inference using Markov chain Monte Carlo (MCMC) methods is presented. The algorithm consists of two basic steps, one in which an ordinary Metropolis-Hastings up-dating step is used, and another in which an importance sampling scheme is used to calculate the acceptance probability of the Metropolis-Hastings step. In paper number two a method for modelling reports given by actors (or other informants) on their social interaction with others is investigated in a Bayesian framework. The model contains two basic ingredients: the unknown network structure and functions that link this unknown network structure to the reports given by the actors. These functions take the form of probit link functions. 
An intrinsic problem is that the model is not identified, meaning that there are combinations of values on the unknown structure and the parameters in the probit link functions that are observationally equivalent. Instead of using restrictions for achieving identification, it is proposed that the different observationally equivalent combinations of parameters and unknown structure be investigated a posteriori. Estimation of parameters is carried out using Gibbs sampling with a switching device that enables transitions between posterior modal regions. The main goal of the procedures is to provide tools for comparisons of different model specifications. Papers 3 and 4 propose Bayesian methods for longitudinal social networks. The premise of the models investigated is that overall change in social networks occurs as a consequence of sequences of incremental changes. Models for the evolution of social networks using continuous-time Markov chains are meant to capture these dynamics. Paper 3 presents an MCMC algorithm for exploring the posteriors of parameters for such Markov chains. More specifically, the unobserved evolution of the network in between observations is explicitly modelled, thereby avoiding the need to deal with explicit formulas for the transition probabilities. This enables likelihood-based parameter inference in a wider class of network evolution models than has been available before. Paper 4 builds on the proposed inference procedure of Paper 3 and demonstrates how to perform model selection for a class of network evolution models.
Abstract:
This work describes the probabilistic modelling of a Bayesian-based mechanism to improve location estimates of an already deployed location system by fusing its outputs with low-cost binary sensors. This mechanism takes advantage of the localization capabilities of different technologies usually present in smart environment deployments. The performance of the proposed algorithm over a real sensor deployment is evaluated using simulated and real experimental data.
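The general idea of fusing a location estimate with a binary sensor reading can be illustrated with a plain application of Bayes' rule over discrete zones. This is a minimal sketch, not the mechanism proposed in the work: the zone names, the prior probabilities, and the detection and false-alarm rates are all hypothetical values chosen for the example.

```python
def fuse_with_binary_sensor(prior, sensor_zone, fired, p_detect=0.9, p_false=0.05):
    """Update a zone-level location estimate with one binary sensor reading.

    prior: dict mapping zone name to probability (output of the location system).
    sensor_zone: zone covered by the binary sensor.
    fired: True if the sensor triggered, False otherwise.
    p_detect, p_false: assumed detection and false-alarm probabilities.
    """
    unnormalised = {}
    for zone, p_zone in prior.items():
        p_fire = p_detect if zone == sensor_zone else p_false
        likelihood = p_fire if fired else 1.0 - p_fire
        unnormalised[zone] = p_zone * likelihood
    total = sum(unnormalised.values())
    return {zone: value / total for zone, value in unnormalised.items()}


# Hypothetical prior from the deployed location system.
prior = {"kitchen": 0.5, "hall": 0.3, "bedroom": 0.2}
# A binary sensor (for example a motion detector) in the hall has just fired.
posterior = fuse_with_binary_sensor(prior, sensor_zone="hall", fired=True)
```

With these illustrative numbers the reading shifts most of the probability mass to the hall, even though the location system alone favoured the kitchen.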
Abstract:
A participatory modelling process has been conducted in two areas of the Guadiana river (the upper and the middle sub-basins), in Spain, with the aim of providing support for decision making in the water management field. The area has a semi-arid climate where irrigated agriculture plays a key role in the economic development of the region and accounts for around 90% of water use. Following the guidelines of the European Water Framework Directive, we promote stakeholder involvement in water management with the aim of achieving an improved understanding of the water system and encouraging the exchange of knowledge and views between stakeholders in order to help build a shared vision of the system. At the same time, the resulting models, which integrate the different sectors and views, provide some insight into the impacts that different management options and possible future scenarios could have. The methodology is based on a Bayesian network combined with an economic model and, in the middle Guadiana sub-basin, with a crop model. The resulting integrated modelling framework is used to simulate possible water policy, market and climate scenarios to find out the impacts of those scenarios on farm income and on the environment. At the end of the modelling process, an evaluation questionnaire was filled in by participants in both sub-basins. Results show that stakeholders find this type of process very helpful for improving their understanding of the system, understanding each other's views and reducing conflict where it exists. In addition, they found the model an extremely useful tool to support management. The graphical interface, the quantitative output and the explicit representation of uncertainty helped stakeholders to better understand the implications of the scenarios tested. Finally, the combination of different types of models was also found very useful, as it allowed specific aspects of the water management problems to be explored in detail.
Validation of the Swiss methane emission inventory by atmospheric observations and inverse modelling
Abstract:
Atmospheric inverse modelling has the potential to provide observation-based estimates of greenhouse gas emissions at the country scale, thereby allowing for an independent validation of national emission inventories. Here, we present a regional-scale inverse modelling study to quantify the emissions of methane (CH₄) from Switzerland, making use of the newly established CarboCount-CH measurement network and a high-resolution Lagrangian transport model. In our reference inversion, prior emissions were taken from the "bottom-up" Swiss Greenhouse Gas Inventory (SGHGI) as published by the Swiss Federal Office for the Environment in 2014 for the year 2012. Overall we estimate national CH₄ emissions to be 196 ± 18 Gg yr⁻¹ for the year 2013 (1σ uncertainty). This result is in close agreement with the recently revised SGHGI estimate of 206 ± 33 Gg yr⁻¹ as reported in 2015 for the year 2012. Results from sensitivity inversions using alternative prior emissions, uncertainty covariance settings, large-scale background mole fractions, two different inverse algorithms (Bayesian and extended Kalman filter), and two different transport models confirm the robustness and independent character of our estimate. According to the latest SGHGI estimate the main CH₄ source categories in Switzerland are agriculture (78 %), waste handling (15 %) and natural gas distribution and combustion (6 %). The spatial distribution and seasonal variability of our posterior emissions suggest an overestimation of agricultural CH₄ emissions by 10 to 20 % in the most recent SGHGI, which is likely due to an overestimation of emissions from manure handling. Urban areas do not appear as emission hotspots in our posterior results, suggesting that leakages from natural gas distribution are only a minor source of CH₄ in Switzerland. 
This is consistent with rather low emissions of 8.4 Gg yr⁻¹ reported by the SGHGI but inconsistent with the much higher value of 32 Gg yr⁻¹ implied by the EDGARv4.2 inventory for this sector. Increased CH₄ emissions (up to 30 % compared to the prior) were deduced for the north-eastern parts of Switzerland. This feature was common to most sensitivity inversions, which is a strong indicator that it is a real feature and not an artefact of the transport model and the inversion system. However, it was not possible to assign an unambiguous source process to the region. The observations of the CarboCount-CH network provided invaluable and independent information for the validation of the national bottom-up inventory. Similar systems need to be sustained to provide independent monitoring of future climate agreements.
Abstract:
The effect of the tumour-forming disease, fibropapillomatosis, on the somatic growth dynamics of green turtles resident in the Pala'au foraging grounds (Moloka'i, Hawai'i) was evaluated using a Bayesian generalised additive mixed modelling approach. This regression model enabled us to account for fixed effects (fibropapilloma tumour severity), nonlinear covariate functional form (carapace size, sampling year) as well as random effects due to individual heterogeneity and correlation between repeated growth measurements on some turtles. Somatic growth rates were found to be nonlinear functions of carapace size and sampling year but were not a function of low-to-moderate tumour severity. On the other hand, growth rates were significantly lower for turtles with advanced fibropapillomatosis, which suggests a limited or threshold-specific disease effect. However, tumour severity was an increasing function of carapace size: larger turtles tended to have higher tumour severity scores, presumably due to longer exposure of larger (older) turtles to the factors that cause the disease. Hence turtles with advanced fibropapillomatosis tended to be the larger turtles, which confounds size and tumour severity in this study. But somatic growth rates for the Pala'au population have also declined since the mid-1980s (sampling year effect) while disease prevalence and severity increased from the mid-1980s before levelling off by the mid-1990s. It is unlikely that this decline was related to the increasing tumour severity because growth rates have also declined over the last 10-20 years for other green turtle populations resident in Hawaiian waters that have low or no disease prevalence. The declining somatic growth rate trends evident in the Hawaiian stock are more likely a density-dependent effect caused by a dramatic increase in abundance by this once-seriously-depleted stock since the mid-1980s. 
So despite increasing fibropapillomatosis risk over the last 20 years, only a limited effect on somatic growth dynamics was apparent and the Hawaiian green turtle stock continues to increase in abundance.
Pharmacokinetic-pharmacodynamic modelling of QT interval prolongation following citalopram overdoses
Abstract:
Aims: To develop a pharmacokinetic-pharmacodynamic model describing the time-course of QT interval prolongation after citalopram overdose and to evaluate the effect of charcoal on the relative risk of developing abnormal QT and heart-rate combinations. Methods: Plasma concentrations and electrocardiograph (ECG) data from 52 patients after 62 citalopram overdose events were analysed in WinBUGS using a Bayesian approach. The reported doses ranged from 20 to 1700 mg and on 17 of the events a single dose of activated charcoal was administered. The developed pharmacokinetic-pharmacodynamic model was used for predicting the probability of having abnormal combinations of QT-RR, which was assumed to be related to an increased risk for torsade de pointes (TdP). Results: The absolute QT interval was related to the observed heart rate with an estimated individual heart-rate correction factor [alpha = 0.36, between-subject coefficient of variation (CV) = 29%]. The heart-rate corrected QT interval was linearly dependent on the predicted citalopram concentration (slope = 40 ms l mg⁻¹, between-subject CV = 70%) in a hypothetical effect-compartment (half-life of effect-delay = 1.4 h). The heart-rate corrected QT was predicted to be higher in women than in men and to increase with age. Administration of activated charcoal resulted in a pronounced reduction of the QT prolongation and was shown to reduce the risk of having abnormal combinations of QT-RR by approximately 60% for citalopram doses above 600 mg. Conclusion: Citalopram caused a delayed lengthening of the QT interval. Administration of activated charcoal was shown to reduce the risk that the QT interval exceeds a previously defined threshold and therefore is expected to reduce the risk of TdP.
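The reported point estimates can be combined in a short worked sketch. The numbers for alpha, the slope and the effect-delay half-life come from the abstract; the functional forms are assumptions made here for illustration (a power-law heart-rate correction QTc = QT / RR^alpha and a first-order effect-compartment rate constant ln 2 / half-life), and the baseline QTc and effect-site concentration in the usage lines are hypothetical.

```python
import math

ALPHA = 0.36                     # heart-rate correction factor (from the abstract)
SLOPE_MS_PER_MG_L = 40.0         # QTc increase in ms per mg/l of citalopram (from the abstract)
EFFECT_DELAY_HALF_LIFE_H = 1.4   # half-life of the effect delay in hours (from the abstract)

# Assumed first-order equilibration rate constant of the effect compartment (per hour).
KE0_PER_H = math.log(2) / EFFECT_DELAY_HALF_LIFE_H


def corrected_qt(qt_ms, rr_s, alpha=ALPHA):
    """Assumed power-law heart-rate correction: QTc = QT / RR**alpha (RR in seconds)."""
    return qt_ms / rr_s ** alpha


def predicted_qtc(baseline_qtc_ms, effect_conc_mg_l, slope=SLOPE_MS_PER_MG_L):
    """Linear concentration-effect relation on the heart-rate corrected QT interval."""
    return baseline_qtc_ms + slope * effect_conc_mg_l


# Hypothetical patient: baseline QTc of 400 ms, effect-site concentration of 0.5 mg/l.
qtc_on_drug = predicted_qtc(400.0, 0.5)
# At an RR interval of 1 s the correction leaves the measured QT unchanged.
qtc_at_60_bpm = corrected_qt(400.0, 1.0)
```

These are population-level point estimates only; the abstract reports substantial between-subject variability (CV of 29% on alpha and 70% on the slope), which a full Bayesian analysis would carry through to the predictions.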
Abstract:
Ecological regions are increasingly used as a spatial unit for planning and environmental management. It is important to define these regions in a scientifically defensible way to justify any decisions made on the basis that they are representative of broad environmental assets. The paper describes a methodology and tool to identify cohesive bioregions. The methodology applies an elicitation process to obtain geographical descriptions for bioregions; each of these is transformed into a Normal density estimate on environmental variables within that region. This prior information is balanced with data classification of environmental datasets using a Bayesian statistical modelling approach to objectively map ecological regions. The method is called model-based clustering as it fits a Normal mixture model to the clusters associated with regions, and it addresses issues of uncertainty in environmental datasets due to overlapping clusters.
Abstract:
This paper reports preliminary progress on a principled approach to modelling nonstationary phenomena using neural networks. We are concerned with both parameter and model order complexity estimation. The basic methodology assumes a Bayesian foundation. However, to allow the construction of pragmatic models, successive approximations have to be made to permit computational tractability. The lowest order corresponds to the (Extended) Kalman filter approach to parameter estimation which has already been applied to neural networks. We illustrate some of the deficiencies of the existing approaches and discuss our preliminary generalisations, by considering the application to nonstationary time series.