925 resultados para Testicular regression
Resumo:
Increasingly, regression models are used when residuals are spatially correlated. Prominent examples include studies in environmental epidemiology to understand the chronic health effects of pollutants. I consider the effects of residual spatial structure on the bias and precision of regression coefficients, developing a simple framework in which to understand the key issues and derive informative analytic results. When the spatial residual is induced by an unmeasured confounder, regression models with spatial random effects and closely-related models such as kriging and penalized splines are biased, even when the residual variance components are known. Analytic and simulation results show how the bias depends on the spatial scales of the covariate and the residual; bias is reduced only when there is variation in the covariate at a scale smaller than the scale of the unmeasured confounding. I also discuss how the scales of the residual and the covariate affect efficiency and uncertainty estimation when the residuals can be considered independent of the covariate. In an application on the association between black carbon particulate matter air pollution and birth weight, controlling for large-scale spatial variation appears to reduce bias from unmeasured confounders, while increasing uncertainty in the estimated pollution effect.
Resumo:
Various inference procedures for linear regression models with censored failure times have been studied extensively. Recent developments on efficient algorithms to implement these procedures enhance the practical usage of such models in survival analysis. In this article, we present robust inferences for certain covariate effects on the failure time in the presence of "nuisance" confounders under a semiparametric, partial linear regression setting. Specifically, the estimation procedures for the regression coefficients of interest are derived from a working linear model and are valid even when the function of the confounders in the model is not correctly specified. The new proposals are illustrated with two examples and their validity for cases with practical sample sizes is demonstrated via a simulation study.
Resumo:
In epidemiological work, outcomes are frequently non-normal, sample sizes may be large, and effects are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. We focus on binary outcomes, with the risk surface a smooth function of space. We compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation. A Bayesian model using a spectral basis representation of the spatial surface provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial features while limiting overfitting and being more efficient computationally than other Bayesian approaches. One of the contributions of this work is further development of this underused representation. The spectral basis model outperforms the penalized likelihood methods, which are prone to overfitting, but is slower to fit and not as easily implemented. Conclusions based on a real dataset of cancer cases in Taiwan are similar albeit less conclusive with respect to comparing the approaches. The success of the spectral basis with binary data and similar results with count data suggest that it may be generally useful in spatial models and more complicated hierarchical models.
Resumo:
The Receiver Operating Characteristic (ROC) curve is a prominent tool for characterizing the accuracy of continuous diagnostic test. To account for factors that might invluence the test accuracy, various ROC regression methods have been proposed. However, as in any regression analysis, when the assumed models do not fit the data well, these methods may render invalid and misleading results. To date practical model checking techniques suitable for validating existing ROC regression models are not yet available. In this paper, we develop cumulative residual based procedures to graphically and numerically assess the goodness-of-fit for some commonly used ROC regression models, and show how specific components of these models can be examined within this framework. We derive asymptotic null distributions for the residual process and discuss resampling procedures to approximate these distributions in practice. We illustrate our methods with a dataset from the Cystic Fibrosis registry.
Resumo:
Suppose that we are interested in establishing simple, but reliable rules for predicting future t-year survivors via censored regression models. In this article, we present inference procedures for evaluating such binary classification rules based on various prediction precision measures quantified by the overall misclassification rate, sensitivity and specificity, and positive and negative predictive values. Specifically, under various working models we derive consistent estimators for the above measures via substitution and cross validation estimation procedures. Furthermore, we provide large sample approximations to the distributions of these nonsmooth estimators without assuming that the working model is correctly specified. Confidence intervals, for example, for the difference of the precision measures between two competing rules can then be constructed. All the proposals are illustrated with two real examples and their finite sample properties are evaluated via a simulation study.
Resumo:
Traffic particle concentrations show considerable spatial variability within a metropolitan area. We consider latent variable semiparametric regression models for modeling the spatial and temporal variability of black carbon and elemental carbon concentrations in the greater Boston area. Measurements of these pollutants, which are markers of traffic particles, were obtained from several individual exposure studies conducted at specific household locations as well as 15 ambient monitoring sites in the city. The models allow for both flexible, nonlinear effects of covariates and for unexplained spatial and temporal variability in exposure. In addition, the different individual exposure studies recorded different surrogates of traffic particles, with some recording only outdoor concentrations of black or elemental carbon, some recording indoor concentrations of black carbon, and others recording both indoor and outdoor concentrations of black carbon. A joint model for outdoor and indoor exposure that specifies a spatially varying latent variable provides greater spatial coverage in the area of interest. We propose a penalised spline formation of the model that relates to generalised kringing of the latent traffic pollution variable and leads to a natural Bayesian Markov Chain Monte Carlo algorithm for model fitting. We propose methods that allow us to control the degress of freedom of the smoother in a Bayesian framework. Finally, we present results from an analysis that applies the model to data from summer and winter separately
Resumo:
In environmental epidemiology, exposure X and health outcome Y vary in space and time. We present a method to diagnose the possible influence of unmeasured confounders U on the estimated effect of X on Y and to propose several approaches to robust estimation. The idea is to use space and time as proxy measures for the unmeasured factors U. We start with the time series case where X and Y are continuous variables at equally-spaced times and assume a linear model. We define matching estimator b(u)s that correspond to pairs of observations with specific lag u. Controlling for a smooth function of time, St, using a kernel estimator is roughly equivalent to estimating the association with a linear combination of the b(u)s with weights that involve two components: the assumptions about the smoothness of St and the normalized variogram of the X process. When an unmeasured confounder U exists, but the model otherwise correctly controls for measured confounders, the excess variation in b(u)s is evidence of confounding by U. We use the plot of b(u)s versus lag u, lagged-estimator-plot (LEP), to diagnose the influence of U on the effect of X on Y. We use appropriate linear combination of b(u)s or extrapolate to b(0) to obtain novel estimators that are more robust to the influence of smooth U. The methods are extended to time series log-linear models and to spatial analyses. The LEP plot gives us a direct view of the magnitude of the estimators for each lag u and provides evidence when models did not adequately describe the data.
Resumo:
Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.