3 results for Stepwise Discriminant Analysis

at Duke University


Relevance:

100.00%

Publisher:

Abstract:

We study the problem of supervised linear dimensionality reduction, taking an information-theoretic viewpoint. The linear projection matrix is designed by maximizing the mutual information between the projected signal and the class label. By harnessing a recent theoretical result on the gradient of mutual information, the above optimization problem can be solved directly using gradient descent, without requiring simplification of the objective function. Theoretical analysis and empirical comparison are made between the proposed method and two closely related methods, and comparisons are also made with a method in which Rényi entropy is used to define the mutual information (in this case the gradient may be computed simply, under a special parameter setting). Relative to these alternative approaches, the proposed method achieves promising results on real datasets. Copyright 2012 by the author(s)/owner(s).

Relevance:

80.00%

Publisher:

Abstract:

As more diagnostic testing options become available to physicians, it becomes more difficult to combine various types of medical information to optimize the overall diagnosis. To improve diagnostic performance, here we introduce an approach to optimize a decision-fusion technique to combine heterogeneous information, such as from different modalities, feature categories, or institutions. For classifier comparison we used two performance metrics: the area under the receiver operating characteristic (ROC) curve (AUC) and the normalized partial area under the curve (pAUC). This study used four classifiers: linear discriminant analysis (LDA), artificial neural network (ANN), and two variants of our decision-fusion technique, AUC-optimized (DF-A) and pAUC-optimized (DF-P) decision fusion. We applied each of these classifiers with 100-fold cross-validation to two heterogeneous breast cancer data sets: one of mass lesion features and a much more challenging one of microcalcification lesion features. For the calcification data set, DF-A outperformed the other classifiers in terms of AUC (p < 0.02) and achieved AUC = 0.85 ± 0.01. DF-P surpassed the other classifiers in terms of pAUC (p < 0.01) and reached pAUC = 0.38 ± 0.02. For the mass data set, DF-A outperformed both the ANN and the LDA (p < 0.04) and achieved AUC = 0.94 ± 0.01. Although for this data set there were no statistically significant differences among the classifiers' pAUC values (pAUC = 0.57 ± 0.07 to 0.67 ± 0.05, p > 0.10), DF-P did significantly improve specificity versus the LDA at both 98% and 100% sensitivity (p < 0.04). In conclusion, decision fusion directly optimized clinically significant performance measures, such as AUC and pAUC, and sometimes outperformed two well-known machine-learning techniques when applied to two different breast cancer data sets.
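
For readers unfamiliar with the AUC metric named in this abstract, the following is a minimal sketch of one standard way to compute it: as the fraction of (positive, negative) case pairs in which the positive case receives the higher classifier score, with ties counted as half. The function name `auc_pairwise` and the example scores are illustrative assumptions, not taken from the study. The partial-area variant (pAUC) is not sketched, because the abstract does not state the sensitivity range or the normalization it uses.

```python
def auc_pairwise(positive_scores, negative_scores):
    """Estimate AUC as the probability that a positive case outscores a negative one.

    Hypothetical helper for illustration; ties contribute 0.5.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Made-up classifier outputs: malignant cases vs. benign cases.
    malignant = [0.8, 0.4]
    benign = [0.6, 0.2]
    print(auc_pairwise(malignant, benign))
```

This pairwise form is quadratic in the number of cases, which is acceptable for small data sets; a rank-based computation would be the usual choice for larger ones.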

Relevance:

80.00%

Publisher:

Abstract:

A shearing quotient (SQ) is a way of quantitatively representing the Phase I shearing edges on a molar tooth. Ordinary or phylogenetic least squares regression is fit to data on log molar length (independent variable) and log sum of measured shearing crests (dependent variable). The derived linear equation is used to generate an 'expected' shearing crest length from molar length of included individuals or taxa. Following conversion of all variables to real space, the expected value is subtracted from the observed value for each individual or taxon. The result is then divided by the expected value and multiplied by 100. SQs have long been the metric of choice for assessing dietary adaptations in fossil primates. Not all studies using SQ have used the same tooth position or crests, nor have all computed regression equations using the same approach. Here we focus on re-analyzing the data of one recent study to investigate the magnitude of effects of variation in 1) shearing crest inclusion, and 2) details of the regression setup. We assess the significance of these effects by the degree to which they improve or degrade the association between computed SQs and diet categories. Though altering regression parameters for SQ calculation has a visible effect on plots, numerous iterations of statistical analyses vary surprisingly little in the success of the resulting variables for assigning taxa to dietary preference. This is promising for the comparability of patterns (if not casewise values) in SQ between studies. We suggest that differences in apparent dietary fidelity of recent studies are attributable principally to tooth position examined.
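
The SQ calculation described in this abstract can be sketched as follows, assuming the ordinary (not phylogenetic) least squares variant and natural logarithms. The function names `fit_ols` and `shearing_quotients` and the sample measurements are hypothetical and are not taken from the study.

```python
import math


def fit_ols(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def shearing_quotients(molar_lengths, crest_sums):
    """Return one SQ per individual or taxon.

    Steps follow the abstract: regress log crest sum on log molar length,
    convert the prediction back to real space, then compute
    (observed - expected) / expected * 100.
    """
    log_len = [math.log(v) for v in molar_lengths]
    log_crest = [math.log(v) for v in crest_sums]
    slope, intercept = fit_ols(log_len, log_crest)
    sqs = []
    for length, observed in zip(molar_lengths, crest_sums):
        expected = math.exp(intercept + slope * math.log(length))
        sqs.append((observed - expected) / expected * 100.0)
    return sqs


if __name__ == "__main__":
    # Made-up measurements (same units for length and crest sum).
    lengths = [2.0, 4.0, 8.0, 4.0]
    crests = [2.0, 4.0, 8.0, 6.0]
    for sq in shearing_quotients(lengths, crests):
        print(round(sq, 1))
```

A positive SQ means the summed shearing crests are longer than expected for that molar length, and a negative SQ means they are shorter. Because the regression line depends on which individuals or taxa are included, changing the sample changes every SQ, which is one reason casewise values may not be comparable between studies.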