4 results for semi-parametric models

in DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

Prevalent sampling is an efficient and focused approach to the study of the natural history of disease. Right-censored time-to-event data observed from prospective prevalent cohort studies are often subject to left-truncated sampling. Left-truncated samples are not randomly selected from the population of interest and therefore carry a selection bias. Extensive studies have focused on estimating the unbiased distribution given left-truncated samples. However, in many applications, the exact date of disease onset is not observed. For example, in an HIV infection study, the exact HIV infection time is not observable; it is known only that infection occurred between two observable dates. Meeting these challenges motivated our study. We propose parametric models to estimate the unbiased distribution of left-truncated, right-censored time-to-event data with uncertain onset times. We first consider data from length-biased sampling, a special case of left-truncated sampling. Then we extend the proposed method to general left-truncated sampling. With a parametric model, we construct the full likelihood, given a biased sample with unobservable onset of disease. The parameters are estimated by maximizing the constructed likelihood, which adjusts for the selection bias and the unobservable exact onset. Simulations are conducted to evaluate the finite sample performance of the proposed methods. We apply the proposed method to an HIV infection study, estimating the unbiased survival function and covariance coefficients.
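
As a rough illustration of why the likelihood has to adjust for selection bias, the sketch below treats the simplest case only: an exponential model, length-biased sampling, no censoring, and exactly known onset times. These simplifications and all function names are assumptions made for illustration; this is not the model proposed in the abstract. Under length-biased sampling the observed density is t·f(t)/μ, which for an exponential with rate λ is a Gamma(2, 1/λ) density, so the bias-adjusted maximum likelihood estimate is 2n/Σt rather than the naive n/Σt.

```python
import random


def simulate_length_biased_exponential(rate, n, seed=0):
    """Draw n length-biased durations from an exponential(rate) population.

    The length-biased density t * f(t) / mean equals a Gamma density with
    shape 2 and scale 1 / rate, so we can sample from it directly.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(2.0, 1.0 / rate) for _ in range(n)]


def naive_rate_mle(times):
    """Exponential MLE that ignores the sampling bias: n / sum(t)."""
    return len(times) / sum(times)


def length_biased_rate_mle(times):
    """MLE under the length-biased likelihood.

    Log-likelihood: sum(2 * log(rate) + log(t) - rate * t),
    which is maximized at rate = 2n / sum(t).
    """
    return 2.0 * len(times) / sum(times)


if __name__ == "__main__":
    sample = simulate_length_biased_exponential(rate=0.5, n=20000, seed=1)
    print("naive estimate:        ", naive_rate_mle(sample))
    print("bias-adjusted estimate:", length_biased_rate_mle(sample))
```

In this toy setting the naive estimate is exactly half the adjusted one, which shows how strongly an unadjusted likelihood can understate the hazard when longer durations are over-sampled.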

Relevance:

80.00%

Publisher:

Abstract:

An extension of k-ratio multiple comparison methods to rank-based analyses is described. The new method is analogous to the Duncan-Godbold approximate k-ratio procedure for unequal sample sizes or correlated means. The close parallel of the new methods to the Duncan-Godbold approach is shown by demonstrating that they are based upon different parameterizations as starting points. A semi-parametric basis for the new methods is shown by starting from the Cox proportional hazards model, using Wald statistics. From there the log-rank and Gehan-Breslow-Wilcoxon methods may be seen as score-statistic-based methods. Simulations and analysis of a published data set are used to show the performance of the new methods.
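
For readers unfamiliar with the log-rank statistic mentioned above, the following is a minimal sketch of the standard two-sample version, which coincides with the score test for a single binary covariate in the Cox model. It does not implement the k-ratio procedure described in the abstract; the function name and the 0/1 group coding are assumptions chosen for illustration.

```python
def logrank_statistic(times, events, groups):
    """Two-sample log-rank chi-square statistic (1 degree of freedom).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if right-censored
    groups : 0 or 1, the group label of each subject
    """
    records = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in records if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [(u, e, g) for u, e, g in records if u >= t]
        n = len(at_risk)                                   # total at risk
        n1 = sum(1 for _, _, g in at_risk if g == 1)       # at risk in group 1
        d = sum(1 for u, e, _ in at_risk if e == 1 and u == t)   # events at t
        d1 = sum(1 for u, e, g in at_risk if e == 1 and u == t and g == 1)

        # Observed minus expected events in group 1 at this time point.
        observed_minus_expected += d1 - d * n1 / n
        # Hypergeometric variance; undefined when only one subject is at risk.
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

    if variance == 0.0:
        raise ValueError("variance is zero; both groups must be at risk at some event time")
    return observed_minus_expected ** 2 / variance


if __name__ == "__main__":
    # Four uncensored subjects; group 1 fails first.
    print(logrank_statistic([1, 2, 3, 4], [1, 1, 1, 1], [1, 1, 0, 0]))
```

For the four-subject example, the observed-minus-expected sum is 7/6 and the variance is 17/36, giving a statistic of 49/17 (about 2.88).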

Relevance:

30.00%

Publisher:

Abstract:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples from 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric ($\mathrm{MSEP}_m$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures. The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample. Only the random split estimator is conditionally (on $\beta$) unbiased; however, $\mathrm{MSEP}_m$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, $\mathrm{MSEP}_m$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables. To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment.
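
The leave-one-out PRESS statistic recommended above can be sketched as follows. To stay dependency-free, the example assumes a single predictor plus intercept, whereas the study concerns multiple regression; the function names are hypothetical. It computes PRESS two ways: by literally refitting the line with each observation held out, and by the standard shortcut that divides each full-sample residual by one minus its leverage.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def press_by_refitting(xs, ys):
    """PRESS computed literally: drop each point, refit, predict it."""
    total = 0.0
    for i in range(len(xs)):
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (intercept + slope * xs[i])) ** 2
    return total


def press_by_leverage(xs, ys):
    """PRESS from one fit: sum of (residual / (1 - leverage)) squared."""
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    total = 0.0
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - mean_x) ** 2 / sxx
        total += ((y - (intercept + slope * x)) / (1.0 - leverage)) ** 2
    return total


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [1.1, 1.9, 3.2, 3.8, 5.1]
    print(press_by_refitting(xs, ys), press_by_leverage(xs, ys))
```

The two routes agree up to floating-point error, and PRESS is never smaller than the ordinary residual sum of squares because each residual is inflated by its leverage.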

Relevance:

30.00%

Publisher:

Abstract:

In regression analysis, covariate measurement error occurs in many applications. The error-prone covariates are often referred to as latent variables. In this proposed study, we extended the study of Chan et al. (2008) on recovering the latent slope in a simple regression model to a multiple regression model. We presented an approach that applied the Monte Carlo method in the Bayesian framework to the parametric regression model with measurement error in an explanatory variable. The proposed estimator applied the conditional expectation of the latent slope given the observed outcome and surrogate variables in the multiple regression models. A simulation study was presented showing that the method produces an estimator that is efficient in the multiple regression model, especially when the measurement error variance of the surrogate variable is large.
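
To make the measurement-error problem concrete, the sketch below simulates classical additive error in a single covariate and shows the resulting attenuation of the naive least-squares slope. It is not the Bayesian Monte Carlo estimator described in the abstract: it uses simple regression and a textbook method-of-moments correction that assumes the error variance is known. All names and parameter values are assumptions chosen for illustration.

```python
import random


def simulate_error_prone_data(n, slope, sd_x, sd_u, sd_e, seed=0):
    """Return (surrogate, outcome) where surrogate = latent x + error u."""
    rng = random.Random(seed)
    surrogate, outcome = [], []
    for _ in range(n):
        x = rng.gauss(0.0, sd_x)                      # latent covariate, never observed
        surrogate.append(x + rng.gauss(0.0, sd_u))    # what the analyst sees
        outcome.append(slope * x + rng.gauss(0.0, sd_e))
    return surrogate, outcome


def ols_slope(w, y):
    """Least-squares slope of y on w (with intercept)."""
    n = len(w)
    mean_w = sum(w) / n
    mean_y = sum(y) / n
    sww = sum((a - mean_w) ** 2 for a in w)
    return sum((a - mean_w) * (b - mean_y) for a, b in zip(w, y)) / sww


def reliability_ratio(sd_x, sd_u):
    """Share of surrogate variance due to the latent covariate."""
    return sd_x ** 2 / (sd_x ** 2 + sd_u ** 2)


if __name__ == "__main__":
    w, y = simulate_error_prone_data(n=20000, slope=2.0, sd_x=1.0, sd_u=1.0, sd_e=0.5, seed=3)
    naive = ols_slope(w, y)
    corrected = naive / reliability_ratio(1.0, 1.0)   # assumes sd_u is known
    print("naive slope:    ", naive)
    print("corrected slope:", corrected)
```

With equal latent and error variances the reliability ratio is 0.5, so the naive slope is pulled toward roughly half of the true latent slope; the larger the error variance, the more severe the attenuation.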