8 resultados para Linear discriminant function

em Collection Of Biostatistics Research Archive


Relevância:

80.00% 80.00%

Publicador:

Resumo:

High-throughput gene expression technologies such as microarrays have been utilized in a variety of scientific applications. Most of the work has been on assessing univariate associations between gene expression with clinical outcome (variable selection) or on developing classification procedures with gene expression data (supervised learning). We consider a hybrid variable selection/classification approach that is based on linear combinations of the gene expression profiles that maximize an accuracy measure summarized using the receiver operating characteristic curve. Under a specific probability model, this leads to consideration of linear discriminant functions. We incorporate an automated variable selection approach using LASSO. An equivalence between LASSO estimation with support vector machines allows for model fitting using standard software. We apply the proposed method to simulated data as well as data from a recently published prostate cancer study.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Despite the widespread popularity of linear models for correlated outcomes (e.g. linear mixed modesl and time series models), distribution diagnostic methodology remains relatively underdeveloped in this context. In this paper we present an easy-to-implement approach that lends itself to graphical displays of model fit. Our approach involves multiplying the estimated marginal residual vector by the Cholesky decomposition of the inverse of the estimated marginal variance matrix. Linear functions or the resulting "rotated" residuals are used to construct an empirical cumulative distribution function (ECDF), whose stochastic limit is characterized. We describe a resampling technique that serves as a computationally efficient parametric bootstrap for generating representatives of the stochastic limit of the ECDF. Through functionals, such representatives are used to construct global tests for the hypothesis of normal margional errors. In addition, we demonstrate that the ECDF of the predicted random effects, as described by Lange and Ryan (1989), can be formulated as a special case of our approach. Thus, our method supports both omnibus and directed tests. Our method works well in a variety of circumstances, including models having independent units of sampling (clustered data) and models for which all observations are correlated (e.g., a single time series).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we propose methods for smooth hazard estimation of a time variable where that variable is interval censored. These methods allow one to model the transformed hazard in terms of either smooth (smoothing splines) or linear functions of time and other relevant time varying predictor variables. We illustrate the use of this method on a dataset of hemophiliacs where the outcome, time to seroconversion for HIV, is interval censored and left-truncated.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Despite the widespread popularity of linear models for correlated outcomes (e.g. linear mixed models and time series models), distribution diagnostic methodology remains relatively underdeveloped in this context. In this paper we present an easy-to-implement approach that lends itself to graphical displays of model fit. Our approach involves multiplying the estimated margional residual vector by the Cholesky decomposition of the inverse of the estimated margional variance matrix. The resulting "rotated" residuals are used to construct an empirical cumulative distribution function and pointwise standard errors. The theoretical framework, including conditions and asymptotic properties, involves technical details that are motivated by Lange and Ryan (1989), Pierce (1982), and Randles (1982). Our method appears to work well in a variety of circumstances, including models having independent units of sampling (clustered data) and models for which all observations are correlated (e.g., a single time series). Our methods can produce satisfactory results even for models that do not satisfy all of the technical conditions stated in our theory.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Generalized linear mixed models with semiparametric random effects are useful in a wide variety of Bayesian applications. When the random effects arise from a mixture of Dirichlet process (MDP) model, normal base measures and Gibbs sampling procedures based on the Pólya urn scheme are often used to simulate posterior draws. These algorithms are applicable in the conjugate case when (for a normal base measure) the likelihood is normal. In the non-conjugate case, the algorithms proposed by MacEachern and Müller (1998) and Neal (2000) are often applied to generate posterior samples. Some common problems associated with simulation algorithms for non-conjugate MDP models include convergence and mixing difficulties. This paper proposes an algorithm based on the Pólya urn scheme that extends the Gibbs sampling algorithms to non-conjugate models with normal base measures and exponential family likelihoods. The algorithm proceeds by making Laplace approximations to the likelihood function, thereby reducing the procedure to that of conjugate normal MDP models. To ensure the validity of the stationary distribution in the non-conjugate case, the proposals are accepted or rejected by a Metropolis-Hastings step. In the special case where the data are normally distributed, the algorithm is identical to the Gibbs sampler.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Various inference procedures for linear regression models with censored failure times have been studied extensively. Recent developments on efficient algorithms to implement these procedures enhance the practical usage of such models in survival analysis. In this article, we present robust inferences for certain covariate effects on the failure time in the presence of "nuisance" confounders under a semiparametric, partial linear regression setting. Specifically, the estimation procedures for the regression coefficients of interest are derived from a working linear model and are valid even when the function of the confounders in the model is not correctly specified. The new proposals are illustrated with two examples and their validity for cases with practical sample sizes is demonstrated via a simulation study.