22 results for SMOOTHING SPLINE

in Collection Of Biostatistics Research Archive


Relevance:

70.00%

Publisher:

Abstract:

Smoothing splines are a popular approach for non-parametric regression problems. We use periodic smoothing splines to fit a periodic signal plus noise model to data for which we assume there are underlying circadian patterns. In the smoothing spline methodology, choosing an appropriate smoothness parameter is an important step in practice. In this paper, we draw a connection between smoothing splines and REACT estimators that provides motivation for the creation of criteria for choosing the smoothness parameter. The new criteria are compared to three existing methods, namely cross-validation, generalized cross-validation, and generalized maximum likelihood criteria, by a Monte Carlo simulation and by an application to the study of circadian patterns. For most of the situations presented in the simulations, including the practical example, the new criteria outperform the three existing criteria.
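As a rough illustration of choosing a smoothness parameter for a periodic signal, the sketch below smooths equally spaced data by shrinking Fourier coefficients and selects the parameter by generalized cross-validation. It is not the paper's method and does not implement its new criteria: the shrinkage weights 1 / (1 + lam * (2*pi*k)^4), used here as a Fourier-domain analogue of a cubic periodic smoothing spline, the simulated data, and the grid of candidate values are all assumptions made for illustration.

```python
import cmath
import math
import random


def dft(y):
    """Normalised discrete Fourier transform (O(n^2), fine for small n)."""
    n = len(y)
    return [
        sum(y[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]


def shrinkage_weights(n, lam):
    """Assumed weights 1 / (1 + lam * (2*pi*k)^4), symmetric in frequency."""
    return [
        1.0 / (1.0 + lam * (2.0 * math.pi * min(k, n - k)) ** 4)
        for k in range(n)
    ]


def smooth(y, lam):
    """Return fitted values and effective degrees of freedom (sum of weights)."""
    n = len(y)
    coef = dft(y)
    w = shrinkage_weights(n, lam)
    fitted = [
        sum(w[k] * coef[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)).real
        for t in range(n)
    ]
    return fitted, sum(w)


def gcv(y, lam):
    """Generalized cross-validation score n * RSS / (n - edf)^2."""
    n = len(y)
    fitted, edf = smooth(y, lam)
    rss = sum((a - b) ** 2 for a, b in zip(y, fitted))
    return n * rss / (n - edf) ** 2


# Made-up data: one daily cycle sampled 48 times, plus Gaussian noise.
random.seed(0)
n = 48
y = [math.sin(2 * math.pi * t / n) + random.gauss(0.0, 0.3) for t in range(n)]

# Pick the candidate value with the smallest GCV score.
lam_grid = [10.0 ** e for e in range(-10, 1)]
best_lam = min(lam_grid, key=lambda lam: gcv(y, lam))
fitted, edf = smooth(y, best_lam)
```

Because the smoother is diagonal in the Fourier basis, its effective degrees of freedom are simply the sum of the shrinkage weights, which keeps the GCV score cheap to evaluate over a grid.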

Relevance:

30.00%

Publisher:

Abstract:

When different markers are responsive to different aspects of a disease, a combination of multiple markers could provide a better screening test for early detection. It is also reasonable to assume that the risk of disease changes smoothly as the biomarker values change and that the change in risk is monotone with respect to each biomarker. In this paper, we propose a boundary-constrained tensor-product B-spline method to estimate the risk of disease by maximizing a penalized likelihood. To choose the optimal amount of smoothing, two scores are proposed which are extensions of the GCV score (O'Sullivan et al. (1986)) and the GACV score (Xiang and Wahba (1996)) to incorporate linear constraints. Simulation studies are carried out to investigate the performance of the proposed estimator and the selection scores. In addition, sensitivities and specificities based on approximate leave-one-out estimates are proposed to generate more realistic ROC curves. Data from a pancreatic cancer study are used for illustration.

Relevance:

20.00%

Publisher:

Abstract:

This paper proposes a numerically simple routine for locally adaptive smoothing. The locally heterogeneous regression function is modelled as a penalized spline with a smoothly varying smoothing parameter, which is itself modelled as another penalized spline. This is formulated as a hierarchical mixed model, with spline coefficients following a normal distribution, which by itself has a smooth structure over the variances. The modelling exercise is in line with Baladandayuthapani, Mallick & Carroll (2005) or Crainiceanu, Ruppert & Carroll (2006). In contrast to these papers, however, Laplace's method is used for estimation based on the marginal likelihood. This is numerically simple and fast and provides satisfactory results quickly. We also extend the idea to spatial smoothing and smoothing in the presence of non-normal responses.

Relevance:

20.00%

Publisher:

Abstract:

Numerous time series studies have provided strong evidence of an association between increased levels of ambient air pollution and increased levels of hospital admissions, typically at 0, 1, or 2 days after an air pollution episode. An important research aim is to extend existing statistical models so that a more detailed understanding of the time course of hospitalization after exposure to air pollution can be obtained. Information about this time course, combined with prior knowledge about biological mechanisms, could provide the basis for hypotheses concerning the mechanism by which air pollution causes disease. Previous studies have identified two important methodological questions: (1) How can we estimate the shape of the distributed lag between increased air pollution exposure and increased mortality or morbidity? and (2) How should we estimate the cumulative population health risk from short-term exposure to air pollution? Distributed lag models are appropriate tools for estimating air pollution health effects that may be spread over several days. However, estimation for distributed lag models in air pollution and health applications is hampered by the substantial noise in the data and the inherently weak signal that is the target of investigation. We introduce a hierarchical Bayesian distributed lag model that incorporates prior information about the time course of pollution effects and combines information across multiple locations. The model has a connection to penalized spline smoothing using a special type of penalty matrix. We apply the model to estimating the distributed lag between exposure to particulate matter air pollution and hospitalization for cardiovascular and respiratory disease using data from a large United States air pollution and hospitalization database of Medicare enrollees in 94 counties covering the years 1999-2002.
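To make the two methodological questions concrete: a distributed lag model regresses the outcome on exposure at the current and several preceding days, and the cumulative effect is the sum of the lag coefficients. The sketch below is a plain penalized least-squares version with a second-difference penalty on the lag coefficients, not the hierarchical Bayesian model described in the abstract. The exposure series, the "true" lag coefficients, the maximum lag, and the penalty weight are all made up for illustration, and the outcome is generated without noise so the fit can be checked by eye.

```python
import random


def lag_matrix(x, max_lag):
    """Rows are [x[t], x[t-1], ..., x[t-max_lag]] for t = max_lag .. len(x)-1."""
    return [[x[t - l] for l in range(max_lag + 1)]
            for t in range(max_lag, len(x))]


def second_difference_penalty(p):
    """Return D'D, where D takes second differences of a length-p vector."""
    d_rows = []
    for i in range(p - 2):
        row = [0.0] * p
        row[i], row[i + 1], row[i + 2] = 1.0, -2.0, 1.0
        d_rows.append(row)
    return [[sum(r[i] * r[j] for r in d_rows) for j in range(p)]
            for i in range(p)]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_distributed_lag(x, y, max_lag, penalty):
    """Minimise ||y - X theta||^2 + penalty * ||D theta||^2 over theta."""
    X = lag_matrix(x, max_lag)
    y_used = y[max_lag:]  # the first max_lag outcomes have no full lag history
    p = max_lag + 1
    pen = second_difference_penalty(p)
    xtx = [[sum(r[i] * r[j] for r in X) + penalty * pen[i][j]
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y_used)) for i in range(p)]
    return solve(xtx, xty)


# Made-up exposure series and lag coefficients (effects at lags 0, 1, 2).
random.seed(1)
exposure = [random.random() for _ in range(60)]
theta_true = [0.5, 0.3, 0.1]
outcome = [0.0, 0.0] + [
    sum(theta_true[l] * exposure[t - l] for l in range(3))
    for t in range(2, len(exposure))
]

theta_hat = fit_distributed_lag(exposure, outcome, max_lag=2, penalty=1.0)
cumulative_effect = sum(theta_hat)
```

The second-difference penalty leaves lag curves that are linear in the lag unpenalized, which is why the linearly decaying coefficients in this toy example are recovered without shrinkage.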

Relevance:

10.00%

Publisher:

Abstract:

We investigate the interplay of smoothness and monotonicity assumptions when estimating a density from a sample of observations. The nonparametric maximum likelihood estimator of a decreasing density on the positive half line attains a rate of convergence at a fixed point if the density has a negative derivative. The same rate is obtained by a kernel estimator, but the limit distributions are different. If the density is both differentiable and known to be monotone, then a third estimator is obtained by isotonization of a kernel estimator. We show that this again attains the rate of convergence and compare the limit distributions of the three types of estimators. It is shown that both isotonization and smoothing lead to a more concentrated limit distribution and we study the dependence on the proportionality constant in the bandwidth. We also show that isotonization does not change the limit behavior of a kernel estimator with a larger bandwidth, in the case that the density is known to have more than one derivative.
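The third estimator, isotonization of a kernel estimator, can be pictured with a grid-based stand-in: evaluate a kernel density estimate on an equally spaced grid, then replace the values by their least-squares non-increasing fit using the pool-adjacent-violators algorithm. This is only a discrete sketch of the idea and not the construction analysed in the paper; the Gaussian kernel, the bandwidth, the grid, and the simulated exponential sample are assumptions made for illustration.

```python
import math
import random


def kernel_density(sample, grid, bandwidth):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in sample) / norm
        for g in grid
    ]


def decreasing_fit(values):
    """Least-squares non-increasing fit by pool-adjacent-violators (equal weights)."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge while the newest block's mean exceeds the previous block's mean.
        while (len(blocks) > 1
               and blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit


# Made-up sample from a decreasing density (standard exponential).
random.seed(2)
sample = [random.expovariate(1.0) for _ in range(200)]
grid = [0.05 * i for i in range(1, 81)]

raw = kernel_density(sample, grid, bandwidth=0.3)
iso = decreasing_fit(raw)
```

Near zero the unconstrained kernel estimate loses mass across the boundary and tends to rise before it falls; the isotonized values pool that region into a flat stretch while leaving already decreasing stretches unchanged.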

Relevance:

10.00%

Publisher:

Abstract:

In this paper we propose methods for smooth hazard estimation of a time variable where that variable is interval censored. These methods allow one to model the transformed hazard in terms of either smooth (smoothing splines) or linear functions of time and other relevant time-varying predictor variables. We illustrate the use of this method on a dataset of hemophiliacs where the outcome, time to seroconversion for HIV, is interval censored and left-truncated.

Relevance:

10.00%

Publisher:

Abstract:

Boston Harbor has had a history of poor water quality, including contamination by enteric pathogens. We conduct a statistical analysis of data collected by the Massachusetts Water Resources Authority (MWRA) between 1996 and 2002 to evaluate the effects of court-mandated improvements in sewage treatment. Motivated by the ineffectiveness of standard Poisson mixture models and their zero-inflated counterparts, we propose a new negative binomial model for time series of Enterococcus counts in Boston Harbor, where nonstationarity and autocorrelation are modeled using a nonparametric smooth function of time in the predictor. Without further restrictions, this function is not identifiable in the presence of time-dependent covariates; consequently we use a basis orthogonal to the space spanned by the covariates and use penalized quasi-likelihood (PQL) for estimation. We conclude that Enterococcus counts were greatly reduced near the Nut Island Treatment Plant (NITP) outfalls following the transfer of wastewaters from NITP to the Deer Island Treatment Plant (DITP) and that the transfer of wastewaters from Boston Harbor to the offshore diffusers in Massachusetts Bay reduced the Enterococcus counts near the DITP outfalls.

Relevance:

10.00%

Publisher:

Abstract:

We propose a new method for fitting proportional hazards models with error-prone covariates. Regression coefficients are estimated by solving an estimating equation that is the average of the partial likelihood scores based on imputed true covariates. For the purpose of imputation, a linear spline model is assumed for the baseline hazard. We discuss consistency and asymptotic normality of the resulting estimators, and propose a stochastic approximation scheme to obtain the estimates. The algorithm is easy to implement, and reduces to the ordinary Cox partial likelihood approach when the measurement error has a degenerate distribution. Simulations indicate high efficiency and robustness. We consider the special case where error-prone replicates are available on the unobserved true covariates. As expected, increasing the number of replicates for the unobserved covariates increases efficiency and reduces bias. We illustrate the practical utility of the proposed method with an Eastern Cooperative Oncology Group clinical trial where a genetic marker, c-myc expression level, is subject to measurement error.

Relevance:

10.00%

Publisher:

Abstract:

Traffic particle concentrations show considerable spatial variability within a metropolitan area. We consider latent variable semiparametric regression models for modeling the spatial and temporal variability of black carbon and elemental carbon concentrations in the greater Boston area. Measurements of these pollutants, which are markers of traffic particles, were obtained from several individual exposure studies conducted at specific household locations as well as 15 ambient monitoring sites in the city. The models allow for both flexible, nonlinear effects of covariates and for unexplained spatial and temporal variability in exposure. In addition, the different individual exposure studies recorded different surrogates of traffic particles, with some recording only outdoor concentrations of black or elemental carbon, some recording indoor concentrations of black carbon, and others recording both indoor and outdoor concentrations of black carbon. A joint model for outdoor and indoor exposure that specifies a spatially varying latent variable provides greater spatial coverage in the area of interest. We propose a penalised spline formulation of the model that relates to generalised kriging of the latent traffic pollution variable and leads to a natural Bayesian Markov Chain Monte Carlo algorithm for model fitting. We propose methods that allow us to control the degrees of freedom of the smoother in a Bayesian framework. Finally, we present results from an analysis that applies the model to data from summer and winter separately.