927 results for Bayes credible intervals
Abstract:
The aim of many genetic studies is to locate the genomic regions (called quantitative trait loci, QTLs) that contribute to variation in a quantitative trait (such as body weight). Confidence intervals for the locations of QTLs are particularly important for the design of further experiments to identify the gene or genes responsible for the effect. Likelihood support intervals are the most widely used method to obtain confidence intervals for QTL location, but the non-parametric bootstrap has also been recommended. Through extensive computer simulation, we show that bootstrap confidence intervals are poorly behaved and so should not be used in this context. The profile likelihood (or LOD curve) for QTL location has a tendency to peak at genetic markers, and so the distribution of the maximum likelihood estimate (MLE) of QTL location has the unusual feature of point masses at genetic markers; this contributes to the poor behavior of the bootstrap. Likelihood support intervals and approximate Bayes credible intervals, on the other hand, are shown to behave appropriately.
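The abstract's diagnosis is easier to follow with the bootstrap mechanics in view. Below is a minimal, generic percentile-bootstrap sketch in Python (the function and the toy data are ours, not the paper's): when an estimator's sampling distribution carries point masses, as the MLE of QTL location does at markers, the bootstrap quantiles inherit that discreteness and the resulting intervals can behave poorly.

```python
import numpy as np

rng = np.random.default_rng(0)

def percentile_bootstrap_ci(data, estimator, n_boot=2000, level=0.95):
    """Nonparametric percentile bootstrap CI: resample with replacement,
    re-apply the estimator, take empirical quantiles of the replicates."""
    n = len(data)
    reps = np.array([
        estimator(data[rng.integers(0, n, size=n)]) for _ in range(n_boot)
    ])
    alpha = 1.0 - level
    return np.quantile(reps, [alpha / 2, 1 - alpha / 2])

# Toy usage: interval for the median of skewed data.
sample = rng.exponential(scale=2.0, size=100)
print(percentile_bootstrap_ci(sample, np.median))
```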
Abstract:
Graduate Program in Applied and Computational Mathematics - FCT
Abstract:
We report on measurements of neutrino oscillation using data from the T2K long-baseline neutrino experiment collected between 2010 and 2013. In an analysis of muon neutrino disappearance alone, we find the following estimates and 68% confidence intervals for the two possible mass hierarchies. Normal hierarchy: sin²θ₂₃ = 0.514 (+0.055/−0.056) and ∆m²₃₂ = (2.51 ± 0.10) × 10⁻³ eV²/c⁴. Inverted hierarchy: sin²θ₂₃ = 0.511 ± 0.055 and ∆m²₁₃ = (2.48 ± 0.10) × 10⁻³ eV²/c⁴. The analysis accounts for multi-nucleon mechanisms in neutrino interactions, which were found to introduce negligible bias. We describe our first analyses that combine measurements of muon neutrino disappearance and electron neutrino appearance to estimate four oscillation parameters, |∆m²|, sin²θ₂₃, sin²θ₁₃ and δCP, and the mass hierarchy. Frequentist and Bayesian intervals are presented for combinations of these parameters, with and without including recent reactor measurements. At the 90% confidence level and including reactor measurements, we exclude the region δCP = [0.15, 0.83]π for the normal hierarchy and δCP = [−0.08, 1.09]π for the inverted hierarchy. The T2K and reactor data weakly favor the normal hierarchy, with a Bayes factor of 2.2. The most probable values and 68% 1D credible intervals for the other oscillation parameters, when reactor data are included, are sin²θ₂₃ = 0.528 (+0.055/−0.038) and |∆m²₃₂| = (2.51 ± 0.11) × 10⁻³ eV²/c⁴.
Abstract:
Objective: We aimed to predict sub-national spatial variation in the numbers of people infected with Schistosoma haematobium, and the associated uncertainties, in Burkina Faso, Mali and Niger, prior to the implementation of national control programmes. Methods: We used national field survey datasets covering a contiguous area of 2,750 × 850 km, from 26,790 school-aged children (5–14 years) in 418 schools. Bayesian geostatistical models were used to predict the prevalence of high- and low-intensity infections and associated 95% credible intervals (CrI). Numbers infected were determined by multiplying predicted prevalence by the numbers of school-aged children in the 1 km² pixels covering the study area. Findings: Numbers of school-aged children with low-intensity infections were 433,268 in Burkina Faso, 872,328 in Mali and 580,286 in Niger. Numbers with high-intensity infections were 416,009 in Burkina Faso, 511,845 in Mali and 254,150 in Niger. The 95% CrIs (indicative of uncertainty) were wide; e.g. the mean number of boys aged 10–14 years infected in Mali was 140,200 (95% CrI 6,200–512,100). Conclusion: National aggregate estimates of numbers infected mask important local variation; e.g. most S. haematobium infections in Niger occur in the Niger River valley. The prevalence of high-intensity infections was strongly clustered in foci in western and central Mali, north-eastern and north-western Burkina Faso, and the Niger River valley in Niger. Populations in these foci are likely to carry the bulk of the urinary schistosomiasis burden and should receive priority for schistosomiasis control. Uncertainties in predicted prevalence and numbers infected should be acknowledged and taken into consideration by control programme planners.
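As a rough illustration of the arithmetic behind such estimates, the sketch below propagates hypothetical posterior prevalence draws through pixel-level population counts, so that the 95% CrI of the total reflects the full posterior uncertainty (the arrays and figures are invented, not the study's):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior prevalence draws for three 1 km² pixels (rows: draws).
prevalence_draws = rng.beta(a=2, b=8, size=(4000, 3))
children_per_pixel = np.array([120, 450, 80])  # school-aged children per pixel

# Each posterior draw implies a number infected; summing over pixels
# gives a posterior sample for the area total.
total_infected = prevalence_draws @ children_per_pixel
lo, hi = np.quantile(total_infected, [0.025, 0.975])
print(f"mean {total_infected.mean():.0f}, 95% CrI ({lo:.0f}, {hi:.0f})")
```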
Abstract:
This paper describes the formalization and application of a methodology to evaluate the safety benefit of countermeasures in the face of uncertainty. To illustrate the methodology, 18 countermeasures for improving the safety of at-grade railroad crossings (AGRXs) in the Republic of Korea are considered. Akin to "stated preference" methods in travel survey research, the methodology applies random selection and the law of large numbers to derive accident modification factor (AMF) densities from expert opinions. In a full Bayesian analysis framework, the collective opinions in the form of AMF densities (the data likelihood) are combined with prior knowledge (AMF density priors) for the 18 countermeasures to obtain 'best' estimates of the AMFs (posterior credible intervals for the AMFs). The countermeasures are then compared and recommended on the basis of the largest safety returns with minimum risk (uncertainty). To the author's knowledge, the complete methodology is new and has not previously been applied or reported in the literature. The results demonstrate that the methodology is able to discern anticipated differences in safety benefit across candidate countermeasures. Of the 18 countermeasures considered in this analysis, the top three for reducing crashes were found to be in-vehicle warning systems, obstacle detection systems, and constant warning time systems.
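One way to picture the prior-times-likelihood combination described here is a conjugate normal update on the log-AMF scale. This is only a schematic stand-in for the paper's full Bayesian analysis, with invented numbers throughout:

```python
import numpy as np

# Prior on log(AMF) for one countermeasure (hypothetical values).
prior_mean, prior_var = np.log(0.9), 0.25
# Expert-elicited AMF density summarized as a normal likelihood on log(AMF).
expert_mean, expert_var = np.log(0.7), 0.10

# Conjugate normal-normal update: precision-weighted combination.
post_var = 1.0 / (1.0 / prior_var + 1.0 / expert_var)
post_mean = post_var * (prior_mean / prior_var + expert_mean / expert_var)

lo, hi = np.exp(post_mean + np.array([-1.96, 1.96]) * np.sqrt(post_var))
print(f"posterior 95% credible interval for the AMF: ({lo:.2f}, {hi:.2f})")
```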
Abstract:
Estimating potential health risks associated with recycled (reused) water is highly complex given the multiple factors affecting water quality. We take a conceptual model, which represents the factors and pathways by which recycled water may pose a risk of contracting gastroenteritis, convert the conceptual model to a Bayesian net, and quantify the model using one expert’s opinion. This allows us to make various predictions as to the risks posed under various scenarios. Bayesian nets provide an additional way of modeling the determinants of recycled water quality and elucidating their relative influence on a given disease outcome. The important contribution to Bayesian net methodology is that all model predictions, whether risk or relative risk estimates, are expressed as credible intervals.
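To make "predictions expressed as credible intervals" concrete: if the net's conditional probabilities are themselves uncertain (elicited as distributions), repeatedly drawing them and recomputing the prediction yields a credible interval for the risk. The two-node toy below is ours, not the paper's model, and the Beta parameters are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
n_draws = 10_000

# Uncertain conditional probabilities encoded as Beta distributions:
# P(pathogen in recycled water) and P(gastroenteritis | exposure status).
p_pathogen = rng.beta(2, 18, n_draws)
p_ill_given_pathogen = rng.beta(3, 27, n_draws)
p_ill_given_clean = rng.beta(1, 99, n_draws)

# Marginal risk implied by each draw of the net's parameters.
risk = p_pathogen * p_ill_given_pathogen + (1 - p_pathogen) * p_ill_given_clean

lo, hi = np.quantile(risk, [0.025, 0.975])
print(f"risk 95% credible interval: ({lo:.4f}, {hi:.4f})")
```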
Abstract:
The research objectives of this thesis were to contribute to Bayesian statistical methodology, both in risk assessment and in spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas and to use these applications as a springboard for developing new statistical methods, as well as undertaking analyses that might answer particular applied questions. This thesis therefore considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems: risk assessment analyses for wastewater, and a four-dimensional dataset assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were: to develop a method for calculating credible intervals for the point estimates of Bayesian networks; to develop a model structure that incorporates all the experimental uncertainty associated with various constants, thereby allowing the calculation of credible intervals that better reflect the underlying uncertainty in a risk assessment; to model a single day's data from the agricultural dataset in a way that satisfactorily captured the complexities of the data; to build a model for several days' data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations. This work forms five papers: two have been published, two are submitted, and the final paper is still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as an 'errors-in-variables' problem [Fuller, 1987]. This illustrated a simple method for incorporating experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models are used with neighbours only in the same depth layer. This gave the model flexibility, allowing both the spatially structured and unstructured variances to differ at each depth. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth.
Hence, a number of essentially non-parametric approaches were taken to examine the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model; the applied contribution was the analysis of moisture over depth and the estimation of the contrast of interest together with its credible intervals. These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that, with large datasets, the use of WinBUGS becomes problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days' data, and we show that soil moisture for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with variances that increase with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time with the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, it does not give insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage analysed as a set of time series. We see this spatio-temporal interaction model as a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
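The defining feature of the CAR layered model, neighbours restricted to the same depth layer, comes down to how the adjacency structure is built. A schematic in Python, with grid dimensions of our own choosing:

```python
import numpy as np

def layered_car_adjacency(n_rows, n_cols, n_layers):
    """Adjacency matrix with 4-neighbour structure within each depth layer
    and no edges between layers, as in a 'layered' CAR specification."""
    n_cells = n_rows * n_cols
    W = np.zeros((n_cells * n_layers, n_cells * n_layers), dtype=int)
    for layer in range(n_layers):
        off = layer * n_cells
        for r in range(n_rows):
            for c in range(n_cols):
                i = off + r * n_cols + c
                if c + 1 < n_cols:   # east neighbour, same layer
                    W[i, i + 1] = W[i + 1, i] = 1
                if r + 1 < n_rows:   # south neighbour, same layer
                    W[i, i + n_cols] = W[i + n_cols, i] = 1
    return W

W = layered_car_adjacency(n_rows=4, n_cols=5, n_layers=3)
print(W.sum(axis=1).reshape(3, 4, 5))  # neighbour counts per cell, by layer
```

A CAR prior built on this W (e.g. with precision proportional to D − ρW, where D holds the row sums of W) then lets the structured and unstructured variances differ by depth, as described above.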
Abstract:
There is currently a lack of reference values for indoor air fungal concentrations to allow for the interpretation of measurement results in subtropical school settings. Analysis of the results of this work established that, in the majority of properly maintained subtropical school buildings, without any major affecting events such as floods or visible mould or moisture contamination, indoor culturable fungi levels were driven by outdoor concentrations. The results also allowed us to benchmark the "baseline range" concentrations for total culturable fungi, Penicillium spp., Cladosporium spp. and Aspergillus spp. in such school settings. The measured concentrations of total culturable fungi and the three individual fungal genera were estimated using Bayesian hierarchical modelling. Pooling these estimates provided a predictive distribution for concentrations at an unobserved school. The results indicated that "baseline" indoor concentration levels for indoor total fungi, Penicillium spp., Cladosporium spp. and Aspergillus spp. in such school settings were generally ≤ 1450, ≤ 680, ≤ 480 and ≤ 90 cfu/m³, respectively, and that elevated levels would indicate mould damage in building structures. The indoor/outdoor ratio for most classrooms had a 95% credible interval containing 1, indicating that fungal concentrations are generally the same indoors and outdoors at each school. Bayesian fixed effects regression modelling showed that increasing both temperature and humidity resulted in higher fungal concentrations.
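The indoor/outdoor comparison reduces to asking whether the 95% credible interval of a posterior ratio contains 1. A minimal sketch with simulated posterior draws standing in for the study's MCMC output:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical posterior draws of mean log-concentration, indoor and outdoor.
log_indoor = rng.normal(np.log(400), 0.15, size=5000)
log_outdoor = rng.normal(np.log(420), 0.15, size=5000)

ratio = np.exp(log_indoor - log_outdoor)  # indoor/outdoor ratio draws
lo, hi = np.quantile(ratio, [0.025, 0.975])
print(f"I/O ratio 95% CrI: ({lo:.2f}, {hi:.2f}); contains 1: {lo <= 1 <= hi}")
```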
Abstract:
This thesis concerns the simulation of simultaneous credible intervals in a Bayesian context. We first consider precipitation data and functions based on these data: the empirical distribution function and the return period, a nonlinear function of the distribution function. We review several known methods for obtaining simultaneous confidence intervals for these functions using a polynomial basis, and we present a method for simulating simultaneous credible intervals. We then move to a Bayesian setting, exploring several prior density models. For the most complex model, Monte Carlo simulation is needed to obtain the posterior simultaneous credible intervals. Finally, we use a nonlinear basis built on the angular transformation and monotone splines to obtain a valid simultaneous credible interval for the return period.
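One standard way to simulate a simultaneous credible band, shown here only as a generic sketch and not as the thesis's method: widen symmetric pointwise posterior quantile bands until the target fraction of entire posterior curves lies inside the band at every grid point.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical posterior draws of a function on a grid (rows: draws).
x = np.linspace(0, 1, 50)
curves = (np.sin(2 * np.pi * x)
          + rng.normal(0, 0.3, size=(2000, 1))        # draw-level shift
          + rng.normal(0, 0.1, size=(2000, x.size)))  # pointwise noise

def simultaneous_band(curves, level=0.95):
    """Shrink the pointwise tail probability until `level` of the posterior
    curves lie entirely inside the band (joint, not pointwise, coverage)."""
    for alpha in np.linspace(0.025, 0.0005, 200):
        lo = np.quantile(curves, alpha, axis=0)
        hi = np.quantile(curves, 1 - alpha, axis=0)
        if np.all((curves >= lo) & (curves <= hi), axis=1).mean() >= level:
            return lo, hi
    return lo, hi

lo, hi = simultaneous_band(curves)
print(f"joint 95% band; width at x=0: {hi[0] - lo[0]:.2f}")
```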
Abstract:
A study to monitor boreal songbird trends was initiated in 1998 in a relatively undisturbed and remote part of the boreal forest in the Northwest Territories, Canada. Eight years of point count data were collected over the 14 years of the study, 1998-2011. Trends were estimated for 50 bird species using generalized linear mixed-effects models, with random effects to account for temporal (repeat sampling within years) and spatial (stations within stands) autocorrelation and variability associated with multiple observers. We tested whether regional and national Breeding Bird Survey (BBS) trends could, on average, predict trends in our study area. Significant increases in our study area outnumbered decreases by 12 species to 6, an opposite pattern compared to Alberta (6 versus 15, respectively) and Canada (9 versus 20). Twenty-two species with relatively precise trend estimates (precision to detect > 30% decline in 10 years; observed SE ≤ 3.7%/year) showed nonsignificant trends, similar to Alberta (24) and Canada (20). Precision-weighted trends for a sample of 19 species with both reliable trends at our site and small portions of their range covered by BBS in Canada were, on average, more negative for Alberta (1.34% per year lower) and for Canada (1.15% per year lower) relative to Fort Liard, though 95% credible intervals still contained zero. We suggest that part of the differences could be attributable to local resource pulses (insect outbreak). However, we also suggest that the tendency for BBS route coverage to disproportionately sample more southerly, developed areas in the boreal forest could result in BBS trends that are not representative of range-wide trends for species whose range is centred farther north.
Abstract:
Technical efficiency is estimated and examined for a cross-section of Australian dairy farms using various frontier methodologies: Bayesian and classical stochastic frontiers, and Data Envelopment Analysis. The results indicate that technical inefficiency is present in the sample data. Statistical differences between the point estimates of technical efficiency generated by the various methodologies are also identified. However, the ranking of farm-level technical efficiency is statistically invariant to the estimation technique employed. Finally, when confidence/credible intervals of technical efficiency are compared, significant overlap is found among many of the farms' intervals for all frontier methods employed. The results indicate that the choice of estimation methodology may matter, but that the explanatory power of all frontier methods is significantly weaker when interval estimates of technical efficiency are examined.
Abstract:
Objectives: To model the impact on chronic disease of a tax on UK food and drink that internalises the wider costs to society of greenhouse gas (GHG) emissions, and to estimate the potential revenue. Design: An econometric and comparative risk assessment modelling study. Setting: The UK. Participants: The UK adult population. Interventions: Two tax scenarios are modelled: (A) a tax of £2.72/tonne carbon dioxide equivalents (tCO2e)/100 g product applied to all food and drink groups with above-average GHG emissions; (B) as in scenario (A), but with food groups whose emissions are below average subsidised to create a tax-neutral scenario. Outcome measures: The primary outcomes are the change in UK population mortality from chronic diseases following the implementation of each taxation strategy, the change in UK GHG emissions and the predicted revenue. The secondary outcomes are the changes to the micronutrient composition of the UK diet. Results: Scenario (A) results in 7,770 (95% credible interval 7,150 to 8,390) deaths averted and a reduction in GHG emissions of 18,683 (14,665 to 22,889) ktCO2e/year. Estimated annual revenue is £2.02 (£1.98 to £2.06) billion. Scenario (B) results in 2,685 (1,966 to 3,402) extra deaths and a reduction in GHG emissions of 15,228 (11,245 to 19,492) ktCO2e/year. Conclusions: Incorporating the societal cost of GHG into the price of foods could save 7,770 lives in the UK each year, reduce food-related GHG emissions and generate substantial tax revenue. The revenue-neutral scenario (B) demonstrates that sustainability and health goals are not always aligned. Future work should focus on investigating the health impact by population subgroup and on designing fiscal strategies to promote both sustainable and healthy diets.
Abstract:
This paper investigates the feasibility of using approximate Bayesian computation (ABC) to calibrate and evaluate complex individual-based models (IBMs). As ABC evolves, various versions are emerging, but here we explore only the most accessible version, rejection-ABC. Rejection-ABC involves running models a large number of times, with parameters drawn randomly from their prior distributions, and then retaining the simulations closest to the observations. Although ABC is well established in some fields, whether it will work with ecological IBMs is still uncertain. Rejection-ABC was applied to an existing 14-parameter earthworm energy budget IBM for which the available data consist of body mass growth and cocoon production in four experiments. ABC was able to narrow the posterior distributions of seven parameters, estimating credible intervals for each. ABC's accepted values produced slightly better fits than the literature values. The accuracy of the analysis was assessed using cross-validation and coverage, currently the best available tests. Of the seven unnarrowed parameters, ABC revealed that three were correlated with other parameters, while the remaining four were found to be not estimable given the available data. It is often desirable to compare models to see whether all component modules are necessary. Here we used ABC model selection to compare the full model with a simplified version that removed the earthworm's movement and much of the energy budget. We are able to show that inclusion of the energy budget is necessary for a good fit to the data. We show how our methodology can inform future modelling cycles, and briefly discuss how more advanced versions of ABC may be applicable to IBMs. We conclude that ABC has the potential to represent uncertainty in model structure, parameters and predictions, and to embed the often complex process of optimizing an IBM's structure and parameters within an established statistical framework, thereby making the process more transparent and objective.
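The rejection-ABC recipe described here fits in a few lines. A generic sketch with a toy linear model standing in for the earthworm IBM (the priors, distance measure and acceptance quota are our choices):

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate(theta):
    """Toy stand-in for the IBM: a noisy line generated from two parameters."""
    return theta[0] + theta[1] * np.linspace(0, 1, 20) + rng.normal(0, 0.1, 20)

observed = 0.5 + 2.0 * np.linspace(0, 1, 20)  # pretend field data

# Rejection ABC: draw parameters from the prior, simulate, and keep the
# draws whose simulated data are closest to the observations.
n_sims, keep = 50_000, 500
prior_draws = rng.uniform([-1.0, 0.0], [2.0, 4.0], size=(n_sims, 2))
distances = np.array([np.linalg.norm(simulate(t) - observed) for t in prior_draws])
posterior = prior_draws[np.argsort(distances)[:keep]]

print("95% credible intervals per parameter:")
print(np.quantile(posterior, [0.025, 0.975], axis=0))
```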
Abstract:
The generalized exponential distribution, proposed by Gupta and Kundu (1999), is a good alternative to standard lifetime distributions such as the exponential, Weibull or gamma. Several authors have considered Bayesian estimation of the parameters of the generalized exponential distribution, assuming independent gamma priors and other informative priors. In this paper, we consider a Bayesian analysis of the generalized exponential distribution assuming conventional non-informative prior distributions, such as the Jeffreys and reference priors, to estimate the parameters. These priors are compared with independent gamma priors for both parameters. The comparison is carried out by examining the frequentist coverage probabilities of the Bayesian credible intervals. We show that the maximal data information prior implies an improper posterior distribution for the parameters of the generalized exponential distribution. It is also shown that the choice of the parameter of interest is very important for the reference prior: different choices lead to different reference priors in this case. Numerical inference for the parameters is illustrated using data sets of different sizes and MCMC (Markov chain Monte Carlo) methods.
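The coverage comparison described above amounts to: simulate many datasets at a known parameter value, compute the credible interval for each, and count how often the truth falls inside. A sketch for a simpler conjugate case (an exponential rate with a gamma prior, our stand-in for the generalized exponential model, which needs MCMC):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

true_rate, n, n_reps = 1.5, 30, 2000
a0, b0 = 0.001, 0.001  # near-noninformative gamma prior (our choice)

covered = 0
for _ in range(n_reps):
    data = rng.exponential(1 / true_rate, size=n)
    # Conjugate posterior for the rate: Gamma(a0 + n, rate b0 + sum(data)).
    posterior = stats.gamma(a=a0 + n, scale=1 / (b0 + data.sum()))
    lo, hi = posterior.ppf([0.025, 0.975])
    covered += lo <= true_rate <= hi

print(f"frequentist coverage of the 95% credible interval: {covered / n_reps:.3f}")
```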
Abstract:
In this paper, distinct prior distributions are derived for a Bayesian inference of the two-parameter Gamma distribution. Noninformative priors, such as the Jeffreys, reference, MDIP and Tibshirani priors, and an innovative prior based on the copula approach, are investigated. We show that the maximal data information prior yields an improper posterior density and that the different choices of the parameter of interest lead to different reference priors in this case. Based on simulated data sets, the Bayesian estimates and credible intervals for the unknown parameters are computed and the performance of the prior distributions is evaluated. The Bayesian analysis is conducted using Markov chain Monte Carlo (MCMC) methods to generate samples from the posterior distributions under the above priors.
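Since none of these priors is conjugate for the two-parameter Gamma, the posteriors are sampled by MCMC. A minimal random-walk Metropolis sketch on (log α, log β), under a flat prior on the logs as a stand-in for the priors studied in the paper:

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(7)
data = rng.gamma(shape=2.0, scale=1.5, size=100)  # synthetic Gamma data

def log_post(log_a, log_b):
    """Log-posterior for (log alpha, log beta); flat prior on the logs."""
    a, b = np.exp(log_a), np.exp(log_b)
    return ((a - 1) * np.log(data).sum() - data.sum() / b
            - data.size * (gammaln(a) + a * np.log(b)))

# Random-walk Metropolis on the log scale.
chain, cur = [], np.array([0.0, 0.0])
cur_lp = log_post(*cur)
for _ in range(20_000):
    prop = cur + rng.normal(0, 0.1, size=2)
    prop_lp = log_post(*prop)
    if np.log(rng.uniform()) < prop_lp - cur_lp:  # Metropolis accept step
        cur, cur_lp = prop, prop_lp
    chain.append(cur)

samples = np.exp(np.array(chain)[5000:])  # back to (alpha, beta); burn-in dropped
print(np.quantile(samples, [0.025, 0.975], axis=0))  # 95% credible intervals
```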