8 resultados para Asymptotic behaviour, Bayesian methods, Mixture models, Overfitting, Posterior concentration
em Collection Of Biostatistics Research Archive
Resumo:
In many clinical trials to evaluate treatment efficacy, it is believed that there may exist latent treatment effectiveness lag times after which medical procedure or chemical compound would be in full effect. In this article, semiparametric regression models are proposed and studied to estimate the treatment effect accounting for such latent lag times. The new models take advantage of the invariance property of the additive hazards model in marginalizing over random effects, so parameters in the models are easy to be estimated and interpreted, while the flexibility without specifying baseline hazard function is kept. Monte Carlo simulation studies demonstrate the appropriateness of the proposed semiparametric estimation procedure. Data collected in the actual randomized clinical trial, which evaluates the effectiveness of biodegradable carmustine polymers for treatment of recurrent brain tumors, are analyzed.
Resumo:
Boston Harbor has had a history of poor water quality, including contamination by enteric pathogens. We conduct a statistical analysis of data collected by the Massachusetts Water Resources Authority (MWRA) between 1996 and 2002 to evaluate the effects of court-mandated improvements in sewage treatment. Motivated by the ineffectiveness of standard Poisson mixture models and their zero-inflated counterparts, we propose a new negative binomial model for time series of Enterococcus counts in Boston Harbor, where nonstationarity and autocorrelation are modeled using a nonparametric smooth function of time in the predictor. Without further restrictions, this function is not identifiable in the presence of time-dependent covariates; consequently we use a basis orthogonal to the space spanned by the covariates and use penalized quasi-likelihood (PQL) for estimation. We conclude that Enterococcus counts were greatly reduced near the Nut Island Treatment Plant (NITP) outfalls following the transfer of wastewaters from NITP to the Deer Island Treatment Plant (DITP) and that the transfer of wastewaters from Boston Harbor to the offshore diffusers in Massachusetts Bay reduced the Enterococcus counts near the DITP outfalls.
Resumo:
In this paper, we develop Bayesian hierarchical distributed lag models for estimating associations between daily variations in summer ozone levels and daily variations in cardiovascular and respiratory (CVDRESP) mortality counts for 19 U.S. large cities included in the National Morbidity Mortality Air Pollution Study (NMMAPS) for the period 1987 - 1994. At the first stage, we define a semi-parametric distributed lag Poisson regression model to estimate city-specific relative rates of CVDRESP associated with short-term exposure to summer ozone. At the second stage, we specify a class of distributions for the true city-specific relative rates to estimate an overall effect by taking into account the variability within and across cities. We perform the calculations with respect to several random effects distributions (normal, t-student, and mixture of normal), thus relaxing the common assumption of a two-stage normal-normal hierarchical model. We assess the sensitivity of the results to: 1) lag structure for ozone exposure; 2) degree of adjustment for long-term trends; 3) inclusion of other pollutants in the model;4) heat waves; 5) random effects distributions; and 6) prior hyperparameters. On average across cities, we found that a 10ppb increase in summer ozone level for every day in the previous week is associated with 1.25 percent increase in CVDRESP mortality (95% posterior regions: 0.47, 2.03). The relative rate estimates are also positive and statistically significant at lags 0, 1, and 2. We found that associations between summer ozone and CVDRESP mortality are sensitive to the confounding adjustment for PM_10, but are robust to: 1) the adjustment for long-term trends, other gaseous pollutants (NO_2, SO_2, and CO); 2) the distributional assumptions at the second stage of the hierarchical model; and 3) the prior distributions on all unknown parameters. Bayesian hierarchical distributed lag models and their application to the NMMAPS data allow us estimation of an acute health effect associated with exposure to ambient air pollution in the last few days on average across several locations. The application of these methods and the systematic assessment of the sensitivity of findings to model assumptions provide important epidemiological evidence for future air quality regulations.
Resumo:
Generalized linear mixed models with semiparametric random effects are useful in a wide variety of Bayesian applications. When the random effects arise from a mixture of Dirichlet process (MDP) model, normal base measures and Gibbs sampling procedures based on the Pólya urn scheme are often used to simulate posterior draws. These algorithms are applicable in the conjugate case when (for a normal base measure) the likelihood is normal. In the non-conjugate case, the algorithms proposed by MacEachern and Müller (1998) and Neal (2000) are often applied to generate posterior samples. Some common problems associated with simulation algorithms for non-conjugate MDP models include convergence and mixing difficulties. This paper proposes an algorithm based on the Pólya urn scheme that extends the Gibbs sampling algorithms to non-conjugate models with normal base measures and exponential family likelihoods. The algorithm proceeds by making Laplace approximations to the likelihood function, thereby reducing the procedure to that of conjugate normal MDP models. To ensure the validity of the stationary distribution in the non-conjugate case, the proposals are accepted or rejected by a Metropolis-Hastings step. In the special case where the data are normally distributed, the algorithm is identical to the Gibbs sampler.
Resumo:
Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.
Resumo:
Quantifying the health effects associated with simultaneous exposure to many air pollutants is now a research priority of the US EPA. Bayesian hierarchical models (BHM) have been extensively used in multisite time series studies of air pollution and health to estimate health effects of a single pollutant adjusted for potential confounding of other pollutants and other time-varying factors. However, when the scientific goal is to estimate the impacts of many pollutants jointly, a straightforward application of BHM is challenged by the need to specify a random-effect distribution on a high-dimensional vector of nuisance parameters, which often do not have an easy interpretation. In this paper we introduce a new BHM formulation, which we call "reduced BHM", aimed at analyzing clustered data sets in the presence of a large number of random effects that are not of primary scientific interest. At the first stage of the reduced BHM, we calculate the integrated likelihood of the parameter of interest (e.g. excess number of deaths attributed to simultaneous exposure to high levels of many pollutants). At the second stage, we specify a flexible random-effect distribution directly on the parameter of interest. The reduced BHM overcomes many of the challenges in the specification and implementation of full BHM in the context of a large number of nuisance parameters. In simulation studies we show that the reduced BHM performs comparably to the full BHM in many scenarios, and even performs better in some cases. Methods are applied to estimate location-specific and overall relative risks of cardiovascular hospital admissions associated with simultaneous exposure to elevated levels of particulate matter and ozone in 51 US counties during the period 1999-2005.
Resumo:
Time series models relating short-term changes in air pollution levels to daily mortality counts typically assume that the effects of air pollution on the log relative rate of mortality do not vary with time. However, these short-term effects might plausibly vary by season. Changes in the sources of air pollution and meteorology can result in changes in characteristics of the air pollution mixture across seasons. The authors develop Bayesian semi-parametric hierarchical models for estimating time-varying effects of pollution on mortality in multi-site time series studies. The methods are applied to the updated National Morbidity and Mortality Air Pollution Study database for the period 1987--2000, which includes data for 100 U.S. cities. At the national level, a 10 micro-gram/m3 increase in PM(10) at lag 1 is associated with a 0.15 (95% posterior interval: -0.08, 0.39),0.14 (-0.14, 0.42), 0.36 (0.11, 0.61), and 0.14 (-0.06, 0.34) percent increase in mortality for winter, spring, summer, and fall, respectively. An analysis by geographical regions finds a strong seasonal pattern in the northeast (with a peak in summer) and little seasonal variation in the southern regions of the country. These results provide useful information for understanding particle toxicity and guiding future analyses of particle constituent data.