942 resultados para hierarchical Bayesian models
Resumo:
Background The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations. Results Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC). It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a Particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. Conclusion ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.
Resumo:
Published evidence suggests that aspects of trial design lead to biased intervention effect estimates, but findings from different studies are inconsistent. This study combined data from 7 meta-epidemiologic studies and removed overlaps to derive a final data set of 234 unique meta-analyses containing 1973 trials. Outcome measures were classified as "mortality," "other objective," "or subjective," and Bayesian hierarchical models were used to estimate associations of trial characteristics with average bias and between-trial heterogeneity. Intervention effect estimates seemed to be exaggerated in trials with inadequate or unclear (vs. adequate) random-sequence generation (ratio of odds ratios, 0.89 [95% credible interval {CrI}, 0.82 to 0.96]) and with inadequate or unclear (vs. adequate) allocation concealment (ratio of odds ratios, 0.93 [CrI, 0.87 to 0.99]). Lack of or unclear double-blinding (vs. double-blinding) was associated with an average of 13% exaggeration of intervention effects (ratio of odds ratios, 0.87 [CrI, 0.79 to 0.96]), and between-trial heterogeneity was increased for such studies (SD increase in heterogeneity, 0.14 [CrI, 0.02 to 0.30]). For each characteristic, average bias and increases in between-trial heterogeneity were driven primarily by trials with subjective outcomes, with little evidence of bias in trials with objective and mortality outcomes. This study is limited by incomplete trial reporting, and findings may be confounded by other study design characteristics. Bias associated with study design characteristics may lead to exaggeration of intervention effect estimates and increases in between-trial heterogeneity in trials reporting subjectively assessed outcomes.
Resumo:
Under a two-level hierarchical model, suppose that the distribution of the random parameter is known or can be estimated well. Data are generated via a fixed, but unobservable realization of this parameter. In this paper, we derive the smallest confidence region of the random parameter under a joint Bayesian/frequentist paradigm. On average this optimal region can be much smaller than the corresponding Bayesian highest posterior density region. The new estimation procedure is appealing when one deals with data generated under a highly parallel structure, for example, data from a trial with a large number of clinical centers involved or genome-wide gene-expession data for estimating individual gene- or center-specific parameters simultaneously. The new proposal is illustrated with a typical microarray data set and its performance is examined via a small simulation study.
Resumo:
Traffic particle concentrations show considerable spatial variability within a metropolitan area. We consider latent variable semiparametric regression models for modeling the spatial and temporal variability of black carbon and elemental carbon concentrations in the greater Boston area. Measurements of these pollutants, which are markers of traffic particles, were obtained from several individual exposure studies conducted at specific household locations as well as 15 ambient monitoring sites in the city. The models allow for both flexible, nonlinear effects of covariates and for unexplained spatial and temporal variability in exposure. In addition, the different individual exposure studies recorded different surrogates of traffic particles, with some recording only outdoor concentrations of black or elemental carbon, some recording indoor concentrations of black carbon, and others recording both indoor and outdoor concentrations of black carbon. A joint model for outdoor and indoor exposure that specifies a spatially varying latent variable provides greater spatial coverage in the area of interest. We propose a penalised spline formation of the model that relates to generalised kringing of the latent traffic pollution variable and leads to a natural Bayesian Markov Chain Monte Carlo algorithm for model fitting. We propose methods that allow us to control the degress of freedom of the smoother in a Bayesian framework. Finally, we present results from an analysis that applies the model to data from summer and winter separately
Resumo:
Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.
Resumo:
Objective. To examine effects of primary care physicians (PCPs) and patients on the association between charges for primary care and specialty care in a point-of-service (POS) health plan. Data Source. Claims from 1996 for 3,308 adult male POS plan members, each of whom was assigned to one of the 50 family practitioner-PCPs with the largest POS plan member-loads. Study Design. A hierarchical multivariate two-part model was fitted using a Gibbs sampler to estimate PCPs' effects on patients' annual charges for two types of services, primary care and specialty care, the associations among PCPs' effects, and within-patient associations between charges for the two services. Adjusted Clinical Groups (ACGs) were used to adjust for case-mix. Principal Findings. PCPs with higher case-mix adjusted rates of specialist use were less likely to see their patients at least once during the year (estimated correlation: –.40; 95% CI: –.71, –.008) and provided fewer services to patients that they saw (estimated correlation: –.53; 95% CI: –.77, –.21). Ten of 11 PCPs whose case-mix adjusted effects on primary care charges were significantly less than or greater than zero (p < .05) had estimated, case-mix adjusted effects on specialty care charges that were of opposite sign (but not significantly different than zero). After adjustment for ACG and PCP effects, the within-patient, estimated odds ratio for any use of primary care given any use of specialty care was .57 (95% CI: .45, .73). Conclusions. PCPs and patients contributed independently to a trade-off between utilization of primary care and specialty care. The trade-off appeared to partially offset significant differences in the amount of care provided by PCPs. These findings were possible because we employed a hierarchical multivariate model rather than separate univariate models.
Resumo:
Time series models relating short-term changes in air pollution levels to daily mortality counts typically assume that the effects of air pollution on the log relative rate of mortality do not vary with time. However, these short-term effects might plausibly vary by season. Changes in the sources of air pollution and meteorology can result in changes in characteristics of the air pollution mixture across seasons. The authors develop Bayesian semi-parametric hierarchical models for estimating time-varying effects of pollution on mortality in multi-site time series studies. The methods are applied to the updated National Morbidity and Mortality Air Pollution Study database for the period 1987--2000, which includes data for 100 U.S. cities. At the national level, a 10 micro-gram/m3 increase in PM(10) at lag 1 is associated with a 0.15 (95% posterior interval: -0.08, 0.39),0.14 (-0.14, 0.42), 0.36 (0.11, 0.61), and 0.14 (-0.06, 0.34) percent increase in mortality for winter, spring, summer, and fall, respectively. An analysis by geographical regions finds a strong seasonal pattern in the northeast (with a peak in summer) and little seasonal variation in the southern regions of the country. These results provide useful information for understanding particle toxicity and guiding future analyses of particle constituent data.
Resumo:
Studies of diagnostic accuracy require more sophisticated methods for their meta-analysis than studies of therapeutic interventions. A number of different, and apparently divergent, methods for meta-analysis of diagnostic studies have been proposed, including two alternative approaches that are statistically rigorous and allow for between-study variability: the hierarchical summary receiver operating characteristic (ROC) model (Rutter and Gatsonis, 2001) and bivariate random-effects meta-analysis (van Houwelingen and others, 1993), (van Houwelingen and others, 2002), (Reitsma and others, 2005). We show that these two models are very closely related, and define the circumstances in which they are identical. We discuss the different forms of summary model output suggested by the two approaches, including summary ROC curves, summary points, confidence regions, and prediction regions.
Resumo:
An appropriate model of recent human evolution is not only important to understand our own history, but it is necessary to disentangle the effects of demography and selection on genome diversity. Although most genetic data support the view that our species originated recently in Africa, it is still unclear if it completely replaced former members of the Homo genus, or if some interbreeding occurred during its range expansion. Several scenarios of modern human evolution have been proposed on the basis of molecular and paleontological data, but their likelihood has never been statistically assessed. Using DNA data from 50 nuclear loci sequenced in African, Asian and Native American samples, we show here by extensive simulations that a simple African replacement model with exponential growth has a higher probability (78%) as compared with alternative multiregional evolution or assimilation scenarios. A Bayesian analysis of the data under this best supported model points to an origin of our species approximately 141 thousand years ago (Kya), an exit out-of-Africa approximately 51 Kya, and a recent colonization of the Americas approximately 10.5 Kya. We also find that the African replacement model explains not only the shallow ancestry of mtDNA or Y-chromosomes but also the occurrence of deep lineages at some autosomal loci, which has been formerly interpreted as a sign of interbreeding with Homo erectus.
Resumo:
In this thesis, we consider Bayesian inference on the detection of variance change-point models with scale mixtures of normal (for short SMN) distributions. This class of distributions is symmetric and thick-tailed and includes as special cases: Gaussian, Student-t, contaminated normal, and slash distributions. The proposed models provide greater flexibility to analyze a lot of practical data, which often show heavy-tail and may not satisfy the normal assumption. As to the Bayesian analysis, we specify some prior distributions for the unknown parameters in the variance change-point models with the SMN distributions. Due to the complexity of the joint posterior distribution, we propose an efficient Gibbs-type with Metropolis- Hastings sampling algorithm for posterior Bayesian inference. Thereafter, following the idea of [1], we consider the problems of the single and multiple change-point detections. The performance of the proposed procedures is illustrated and analyzed by simulation studies. A real application to the closing price data of U.S. stock market has been analyzed for illustrative purposes.
Resumo:
The rise of evidence-based medicine as well as important progress in statistical methods and computational power have led to a second birth of the >200-year-old Bayesian framework. The use of Bayesian techniques, in particular in the design and interpretation of clinical trials, offers several substantial advantages over the classical statistical approach. First, in contrast to classical statistics, Bayesian analysis allows a direct statement regarding the probability that a treatment was beneficial. Second, Bayesian statistics allow the researcher to incorporate any prior information in the analysis of the experimental results. Third, Bayesian methods can efficiently handle complex statistical models, which are suited for advanced clinical trial designs. Finally, Bayesian statistics encourage a thorough consideration and presentation of the assumptions underlying an analysis, which enables the reader to fully appraise the authors' conclusions. Both Bayesian and classical statistics have their respective strengths and limitations and should be viewed as being complementary to each other; we do not attempt to make a head-to-head comparison, as this is beyond the scope of the present review. Rather, the objective of the present article is to provide a nonmathematical, reader-friendly overview of the current practice of Bayesian statistics coupled with numerous intuitive examples from the field of oncology. It is hoped that this educational review will be a useful resource to the oncologist and result in a better understanding of the scope, strengths, and limitations of the Bayesian approach.
Resumo:
OBJECTIVE: Hierarchical modeling has been proposed as a solution to the multiple exposure problem. We estimate associations between metabolic syndrome and different components of antiretroviral therapy using both conventional and hierarchical models. STUDY DESIGN AND SETTING: We use discrete time survival analysis to estimate the association between metabolic syndrome and cumulative exposure to 16 antiretrovirals from four drug classes. We fit a hierarchical model where the drug class provides a prior model of the association between metabolic syndrome and exposure to each antiretroviral. RESULTS: One thousand two hundred and eighteen patients were followed for a median of 27 months, with 242 cases of metabolic syndrome (20%) at a rate of 7.5 cases per 100 patient years. Metabolic syndrome was more likely to develop in patients exposed to stavudine, but was less likely to develop in those exposed to atazanavir. The estimate for exposure to atazanavir increased from hazard ratio of 0.06 per 6 months' use in the conventional model to 0.37 in the hierarchical model (or from 0.57 to 0.81 when using spline-based covariate adjustment). CONCLUSION: These results are consistent with trials that show the disadvantage of stavudine and advantage of atazanavir relative to other drugs in their respective classes. The hierarchical model gave more plausible results than the equivalent conventional model.