Classification and selection of biomarkers in genomic data using LASSO


Author(s): Ghosh, Debashis; Chinnaiyan, Arul
Date(s)

08/06/2004

Abstract

High-throughput gene expression technologies such as microarrays have been used in a variety of scientific applications. Most of the work has focused either on assessing univariate associations between gene expression and clinical outcome (variable selection) or on developing classification procedures with gene expression data (supervised learning). We consider a hybrid variable selection/classification approach based on linear combinations of the gene expression profiles that maximize an accuracy measure summarized using the receiver operating characteristic curve. Under a specific probability model, this leads to consideration of linear discriminant functions. We incorporate an automated variable selection approach using LASSO. An equivalence between LASSO estimation and support vector machines allows for model fitting using standard software. We apply the proposed method to simulated data as well as data from a recently published prostate cancer study.
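
As an illustration of the general idea in the abstract, the sketch below fits an L1-penalized (LASSO) linear score to simulated two-class "expression" data and summarizes its accuracy by the empirical area under the ROC curve. It is only a sketch under stated assumptions: the fit uses plain cyclic coordinate descent on a least-squares loss with class labels coded as -1/+1, not the support vector machine equivalence the authors describe, and the function names, the penalty value `lam=0.3`, and the simulation settings are all hypothetical choices made for this example.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator: shrinks z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n) * sum (y - b0 - x.b)^2 + lam * sum |b_j|
    by cyclic coordinate descent (hypothetical helper, not the authors' fitter)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [y[i] - intercept for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # Unpenalised intercept update: absorb the mean residual.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (column j removed).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_b
    return intercept, beta

def empirical_auc(scores, labels):
    """Empirical AUC: fraction of (case, control) pairs ranked correctly, ties count 1/2."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == -1]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))

def simulate(n_per_class=40, p=20, n_informative=2, shift=2.0, seed=0):
    """Gaussian features; the first n_informative are mean-shifted in cases (label +1)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(p)]
            if label == 1:
                for j in range(n_informative):
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y

X, y = simulate()
intercept, beta = lasso_coordinate_descent(X, y, lam=0.3)
scores = [sum(b * x for b, x in zip(beta, row)) for row in X]
selected = [j for j, b in enumerate(beta) if b != 0.0]
auc = empirical_auc(scores, y)
print("selected features:", selected)
print("in-sample AUC:", round(auc, 3))
```

The nonzero entries of `beta` play the role of the selected biomarkers, and the linear score `x . beta` is the discriminant whose ROC curve is being summarized. The AUC here is computed on the training data, so it is optimistic; a held-out set or cross-validation would be needed for an honest estimate and for choosing the penalty.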

Format

application/pdf

Identifier

http://biostats.bepress.com/umichbiostat/paper42

http://biostats.bepress.com/cgi/viewcontent.cgi?article=1041&context=umichbiostat

Publisher

Collection of Biostatistics Research Archive

Source

The University of Michigan Department of Biostatistics Working Paper Series

Type

text