5 results for Receiver operating characteristics
in Collection Of Biostatistics Research Archive
Abstract:
High-throughput gene expression technologies such as microarrays have been utilized in a variety of scientific applications. Most of the work has been on assessing univariate associations between gene expression and clinical outcome (variable selection) or on developing classification procedures with gene expression data (supervised learning). We consider a hybrid variable selection/classification approach that is based on linear combinations of the gene expression profiles that maximize an accuracy measure summarized using the receiver operating characteristic curve. Under a specific probability model, this leads to consideration of linear discriminant functions. We incorporate an automated variable selection approach using LASSO. An equivalence between LASSO estimation and support vector machines allows for model fitting using standard software. We apply the proposed method to simulated data as well as data from a recently published prostate cancer study.
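As a rough illustration of the idea in this abstract, the sketch below scores subjects with a linear combination of two expression values and summarizes accuracy by the empirical area under the ROC curve (the Mann-Whitney estimate). It assumes a two-group normal model with a common covariance matrix, under which the linear discriminant direction is proportional to the inverse covariance times the mean difference. The function names, the simulated means, and the two-gene setup are hypothetical, and the LASSO selection step described by the authors is not reproduced here.

```python
import random


def empirical_auc(case_scores, control_scores):
    """Mann-Whitney estimate of P(case score > control score); ties count 1/2."""
    wins = 0.0
    for c in case_scores:
        for d in control_scores:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def linear_score(weights, profile):
    """Linear combination of one subject's expression values."""
    return sum(w * x for w, x in zip(weights, profile))


def lda_direction_2d(mu_case, mu_control, cov):
    """Sigma^{-1} (mu_case - mu_control) for a 2x2 common covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx = mu_case[0] - mu_control[0]
    dy = mu_case[1] - mu_control[1]
    return [(d * dx - b * dy) / det, (-c * dx + a * dy) / det]


# Hypothetical simulation: two independent unit-variance "genes".
rng = random.Random(0)
mu_case, mu_control = (1.0, 0.5), (0.0, 0.0)
cov = [[1.0, 0.0], [0.0, 1.0]]

cases = [[rng.gauss(m, 1.0) for m in mu_case] for _ in range(200)]
controls = [[rng.gauss(m, 1.0) for m in mu_control] for _ in range(200)]

weights = lda_direction_2d(mu_case, mu_control, cov)
auc = empirical_auc(
    [linear_score(weights, x) for x in cases],
    [linear_score(weights, x) for x in controls],
)
print("weights:", weights)
print("empirical AUC of the linear score:", auc)
```

Here the weights come from the true simulation parameters; with real data they would have to be estimated, which is where a penalized fit such as the one the abstract describes would enter.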
Abstract:
A marker that is strongly associated with outcome (or disease) is often assumed to be effective for classifying individuals according to their current or future outcome. However, for this to be true, the associated odds ratio must be of a magnitude rarely seen in epidemiological studies. An illustration of the relationship between odds ratios and receiver operating characteristic (ROC) curves shows, for example, that a marker with an odds ratio as high as 3 is in fact a very poor classification tool. If a marker identifies 10 percent of controls as positive (false positives) and has an odds ratio of 3, then it will only correctly identify 25 percent of cases as positive (true positives). Moreover, the authors illustrate that a single measure of association such as an odds ratio does not meaningfully describe a marker’s ability to classify subjects. Appropriate statistical methods for assessing and reporting the classification power of a marker are described. The serious pitfalls of using more traditional methods based on parameters in logistic regression models are illustrated.
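The 25 percent figure quoted in this abstract can be checked with a few lines of arithmetic. For a binary marker, the odds ratio equals the odds of a positive result among cases divided by the odds of a positive result among controls, so fixing the false positive rate and the odds ratio determines the true positive rate. The function name below is hypothetical and the sketch only reproduces this identity.

```python
def tpr_from_odds_ratio(fpr, odds_ratio):
    """True positive rate implied by a false positive rate and an odds ratio.

    Uses OR = [TPR / (1 - TPR)] / [FPR / (1 - FPR)] for a binary marker.
    """
    control_odds = fpr / (1.0 - fpr)       # odds of a positive among controls
    case_odds = odds_ratio * control_odds  # odds of a positive among cases
    return case_odds / (1.0 + case_odds)


# The example from the abstract: 10 percent false positives, odds ratio 3.
print(f"{tpr_from_odds_ratio(0.10, 3.0):.2f}")  # prints 0.25

# How large the odds ratio must be before the marker classifies well.
for odds_ratio in (3, 10, 50, 100):
    print(odds_ratio, f"{tpr_from_odds_ratio(0.10, odds_ratio):.2f}")
```

With the false positive rate held at 10 percent, the control odds are 1/9, so an odds ratio of 3 gives case odds of 1/3 and a true positive rate of 1/4.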
Abstract:
The Receiver Operating Characteristic (ROC) curve is a prominent tool for characterizing the accuracy of a continuous diagnostic test. To account for factors that might influence the test accuracy, various ROC regression methods have been proposed. However, as in any regression analysis, when the assumed models do not fit the data well, these methods may render invalid and misleading results. To date, practical model-checking techniques suitable for validating existing ROC regression models are not yet available. In this paper, we develop cumulative-residual-based procedures to graphically and numerically assess the goodness-of-fit for some commonly used ROC regression models, and show how specific components of these models can be examined within this framework. We derive asymptotic null distributions for the residual process and discuss resampling procedures to approximate these distributions in practice. We illustrate our methods with a dataset from the Cystic Fibrosis registry.