Biblioteca Digital

911 results for Multiple discriminant analysis

Statnote 29: discriminant analysis

Abstract:

Discriminant analysis (also known as discriminant function analysis or multiple discriminant analysis) is a multivariate statistical method of testing the degree to which two or more populations may overlap with each other. It was devised independently by several statisticians, including Fisher, Mahalanobis, and Hotelling. The technique has several possible applications in Microbiology. First, in a clinical microbiological setting, if two different infectious diseases were defined by a number of clinical and pathological variables, it may be useful to decide which measurements were the most effective at distinguishing between the two diseases. Second, in an environmental microbiological setting, the technique could be used to study the relationships between different populations, e.g., to what extent do the properties of soils in which the bacterium Azotobacter is found differ from those in which it is absent? Third, the method can be used as a multivariate ‘t’ test, i.e., given a number of related measurements on two groups, the analysis can provide a single test of the hypothesis that the two populations have the same means for all the variables studied. This statnote describes one of the most popular applications of discriminant analysis: identifying the descriptive variables that can distinguish between two populations.
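As a rough illustration of the two-population case described in this abstract, the sketch below computes Fisher's linear discriminant for two simulated groups with NumPy. The function name `fisher_discriminant`, the simulated data, and the midpoint cut-off are assumptions made for illustration; none of them comes from the Statnote itself.

```python
import numpy as np

def fisher_discriminant(X1, X2):
    """Return Fisher's discriminant direction w and a midpoint cut-off c.

    X1, X2: (n_k, p) arrays of measurements for the two groups.
    Assumes both groups share a common covariance matrix.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    # Pooled within-group covariance matrix.
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    # Direction that maximises between-group relative to within-group spread.
    w = np.linalg.solve(S, m1 - m2)
    # Cut-off halfway between the projected group means (equal priors assumed).
    c = w @ (m1 + m2) / 2.0
    return w, c

# Hypothetical data: two groups measured on two variables.
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2))
X2 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(50, 2))

w, c = fisher_discriminant(X1, X2)
scores1 = X1 @ w  # discriminant scores for group 1
scores2 = X2 @ w  # discriminant scores for group 2
```

An observation with score above `c` would be assigned to the first group, and below `c` to the second. The relative sizes of the entries of `w` hint at which variables separate the groups best only when the variables are on comparable scales.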

Predicting class I major histocompatibility complex (MHC) binders using multivariate statistics: comparison of discriminant analysis and multiple linear regression

Abstract:

The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.

The role of discriminant analysis in the refinement of customer satisfaction assessment

Abstract:

OBJECTIVE: To test discriminant analysis as a method of turning the information of a routine customer satisfaction survey (CSS) into a more accurate decision-making tool. METHODS: A 7-question, 10-multiple choice, self-administered questionnaire was used to study a sample of patients seen in two outpatient care units in Valparaíso, Chile, one of primary care (n=100) and the other of secondary care (n=249). Two cutting points were considered in the dependent variable (final satisfaction score): satisfied versus unsatisfied, and very satisfied versus all others. Results were compared with empirical measures (proportion of satisfied individuals, proportion of unsatisfied individuals and size of the median). RESULTS: The response rate was very high, over 97.0% in both units. A new variable, medical attention, emerged as explaining satisfaction at the primary care unit. The proportion of the total variability explained by the model was very high (over 99.4%) in both units when comparing satisfied with unsatisfied customers. In the analysis of very satisfied versus all other customers, a significant relationship was identified only in the case of the primary care unit, and it explained a small proportion of the variability (41.9%). CONCLUSIONS: Discriminant analysis identified relationships not revealed by the previous analysis. It provided information about the proportion of the variability explained by the model. It identified non-significant relationships suggested by empirical analysis (e.g. the case of the relation very satisfied versus others in the secondary care unit). It measured the contribution of each independent variable to the explanation of the variation of the dependent one.

Canonical discriminant analysis applied to broiler chicken performance

Abstract:

The mechanisms involved in the control of growth in chickens are too complex to be explained by univariate analysis alone, because all related traits are biologically correlated. Therefore, we evaluated broiler chicken performance under a multivariate approach, using canonical discriminant analysis. A total of 1920 chicks from eight treatments, defined as the combination of four broiler chicken strains (Arbor Acres, AgRoss 308, Cobb 500 and RX) and both sexes, were housed in 48 pens. Average feed intake, average live weight, feed conversion and carcass, breast and leg weights were obtained for days 1 to 42. Canonical discriminant analysis was implemented with the SAS® CANDISC procedure, and differences between treatments were obtained by the F-test (P < 0.05) over the squared Mahalanobis distances. Multivariate performance from all treatments could be easily visualised because a single graph was obtained from the first two canonical variables, which explained 96.49% of total variation, using a SAS® CONELIP macro. A clear distinction between sexes was found, where males were better than females. Among strains, Arbor Acres, AgRoss 308 and Cobb 500 (commercial) were better than RX (experimental). Evaluation of broiler chicken performance was facilitated by the fact that the six original traits were reduced to only two canonical variables. Average live weight and carcass weight (first canonical variable) were the most important traits to discriminate treatments. The contrast between average feed intake and average live weight plus feed conversion (second canonical variable) was used to classify them. We suggest analysing performance data sets using canonical discriminant analysis.
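The squared Mahalanobis distance between two treatment means, on which the abstract's F-tests are based, can be sketched as follows. This is a minimal NumPy illustration under the assumption of a pooled within-group covariance matrix; the function name and the toy data are hypothetical, and it does not reproduce the SAS CANDISC output or its F statistic.

```python
import numpy as np

def squared_mahalanobis(X1, X2):
    """Squared Mahalanobis distance between the mean vectors of two groups.

    Uses the pooled within-group covariance matrix, i.e. assumes the two
    groups share a common covariance structure.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    d = m1 - m2
    # d' S^{-1} d, computed without forming the explicit inverse.
    return float(d @ np.linalg.solve(S, d))

# Toy data (hypothetical): two groups of four observations on two traits.
X1 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
X2 = X1 + np.array([3.0, 0.0])  # same spread, shifted by 3 on the first trait

d2 = squared_mahalanobis(X1, X2)
```

Here each trait has variance 4/3 and the traits are uncorrelated, so the distance reduces to 3² / (4/3) = 6.75.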

Discriminant analysis of trace elements in normal, benign and malignant breast tissues measured by total reflection X-ray fluorescence

Abstract:

In this work total reflection X-ray fluorescence spectrometry has been employed to determine trace element concentrations in different human breast tissues (normal, normal adjacent, benign and malignant). A multivariate discriminant analysis of observed levels was performed in order to build a predictive model and perform tissue-type classifications. A total of 83 breast tissue samples were studied. Results showed the presence of Ca, Ti, Fe, Cu and Zn in all analyzed samples. All trace elements, except Ti, were found in higher concentrations in both malignant and benign tissues, when compared to normal tissues and normal adjacent tissues. In addition, the concentration of Fe was higher in malignant tissues than in benign neoplastic tissues. The opposite behavior was observed for Ca, Cu and Zn. Results have shown that discriminant analysis was able to successfully identify differences between trace element distributions from normal and malignant tissues with an overall accuracy of 80% and 65% for independent and paired breast samples respectively, and of 87% for benign and malignant tissues. © 2009 Elsevier B.V. All rights reserved.

High-breakdown linear discriminant analysis

Abstract:

The classification rules of linear discriminant analysis are defined by the true mean vectors and the common covariance matrix of the populations from which the data come. Because these true parameters are generally unknown, they are commonly estimated by the sample mean vector and covariance matrix of the data in a training sample randomly drawn from each population. However, these sample statistics are notoriously susceptible to contamination by outliers, a problem compounded by the fact that the outliers may be invisible to conventional diagnostics. High-breakdown estimation is a procedure designed to remove this cause for concern by producing estimates that are immune to serious distortion by a minority of outliers, regardless of their severity. In this article we motivate and develop a high-breakdown criterion for linear discriminant analysis and give an algorithm for its implementation. The procedure is intended to supplement rather than replace the usual sample-moment methodology of discriminant analysis either by providing indications that the dataset is not seriously affected by outliers (supporting the usual analysis) or by identifying apparently aberrant points and giving resistant estimators that are not affected by them.
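For orientation, the standard sample-moment rule that the abstract refers to can be written as below. The notation is an assumption introduced here, not taken from the article: $\bar{x}_k$ is the sample mean vector of group $G_k$, $S$ the pooled sample covariance matrix, $n$ the total training sample size, $K$ the number of populations, and $\pi_k$ an assumed prior probability for population $k$.

```latex
\delta_k(x) = \bar{x}_k^{\top} S^{-1} x
            - \tfrac{1}{2}\, \bar{x}_k^{\top} S^{-1} \bar{x}_k
            + \log \pi_k ,
\qquad
\hat{k}(x) = \arg\max_{k} \, \delta_k(x),
\qquad
S = \frac{1}{n - K} \sum_{k=1}^{K} \sum_{i \in G_k}
    (x_i - \bar{x}_k)(x_i - \bar{x}_k)^{\top}
```

Because $\bar{x}_k$ and $S$ are plain averages, a single extreme observation can shift them arbitrarily far, which is the sensitivity to outliers that a high-breakdown estimator is meant to limit.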

Multivariate analysis: Classification and discriminant analysis

Exploring the knowledge contained in neuroimages: Statistical discriminant analysis and automatic segmentation of the most significant changes

Abstract:

Objective: The aim of this article is to propose an integrated framework for extracting and describing patterns of disorders from medical images using a combination of linear discriminant analysis and active contour models. Methods: A multivariate statistical methodology was first used to identify the most discriminating hyperplane separating two groups of images (from healthy controls and patients with schizophrenia) contained in the input data. After this, the present work makes explicit the differences found by the multivariate statistical method by subtracting the discriminant models of controls and patients, weighted by the pooled variance between the two groups. A variational level-set technique was used to segment clusters of these differences. We obtain a label for each anatomical change using the Talairach atlas. Results: In this work all the data were analysed simultaneously rather than assuming a priori regions of interest. As a consequence, by using active contour models, we were able to obtain regions of interest that were emergent from the data. The results were evaluated using, as gold standard, well-known facts about the neuroanatomical changes related to schizophrenia. Most of the items in the gold standard were covered in our result set. Conclusions: We argue that such investigation provides a suitable framework for characterising the high complexity of magnetic resonance images in schizophrenia, as the results obtained indicate a high sensitivity rate with respect to the gold standard. © 2010 Elsevier B.V. All rights reserved.

Cities in Worldwide Air and Sea Flows: A multiple networks analysis

Measures of fit in multiple correspondence analysis of crisp and fuzzy coded data

Abstract:

When continuous data are coded to categorical variables, two types of coding are possible: crisp coding in the form of indicator, or dummy, variables with values either 0 or 1; or fuzzy coding where each observation is transformed to a set of "degrees of membership" between 0 and 1, using so-called membership functions. It is well known that the correspondence analysis of crisp coded data, namely multiple correspondence analysis, yields principal inertias (eigenvalues) that considerably underestimate the quality of the solution in a low-dimensional space. Since the crisp data only code the categories to which each individual case belongs, an alternative measure of fit is simply to count how well these categories are predicted by the solution. Another approach is to consider multiple correspondence analysis equivalently as the analysis of the Burt matrix (i.e., the matrix of all two-way cross-tabulations of the categorical variables), and then perform a joint correspondence analysis to fit just the off-diagonal tables of the Burt matrix - the measure of fit is then computed as the quality of explaining these tables only. The correspondence analysis of fuzzy coded data, called "fuzzy multiple correspondence analysis", suffers from the same problem, albeit attenuated. Again, one can count how many correct predictions are made of the categories which have the highest degree of membership. But here one can also defuzzify the results of the analysis to obtain estimated values of the original data, and then calculate a measure of fit in the familiar percentage form, thanks to the resultant orthogonal decomposition of variance. Furthermore, if one thinks of fuzzy multiple correspondence analysis as explaining the two-way associations between variables, a fuzzy Burt matrix can be computed and the same strategy as in the crisp case can be applied to analyse the off-diagonal part of this matrix.
In this paper these alternative measures of fit are defined and applied to a data set of continuous meteorological variables, which are coded crisply and fuzzily into three categories. Measuring the fit is further discussed when the data set consists of a mixture of discrete and continuous variables.
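The difference between crisp and fuzzy coding into three categories can be sketched as follows. This is a minimal NumPy illustration that assumes triangular membership functions with three hinge points and a crisp coding that picks the category of highest membership; the paper's own membership functions and cut-points may differ, and all names and data here are hypothetical.

```python
import numpy as np

def fuzzy_code(x, hinges):
    """Fuzzy-code values into three categories (low, middle, high).

    Triangular membership functions with hinge points (lo, mid, hi),
    assumed to satisfy lo < mid < hi. Each row sums to 1.
    """
    lo, mid, hi = hinges
    x = np.clip(np.asarray(x, dtype=float), lo, hi)
    low = np.where(x <= mid, (mid - x) / (mid - lo), 0.0)
    high = np.where(x >= mid, (x - mid) / (hi - mid), 0.0)
    middle = 1.0 - low - high
    return np.column_stack([low, middle, high])

def crisp_code(x, hinges):
    """Crisp (0/1) coding: a 1 in the category of highest membership.

    Ties are resolved in favour of the lower category.
    """
    F = fuzzy_code(x, hinges)
    Z = np.zeros_like(F)
    Z[np.arange(len(F)), F.argmax(axis=1)] = 1.0
    return Z

def defuzzify(F, hinges):
    """Recover values as the membership-weighted average of the hinges."""
    return F @ np.asarray(hinges, dtype=float)

# Hypothetical measurements and hinge points.
x = np.array([0.0, 2.5, 5.0, 7.5, 10.0])
hinges = (0.0, 5.0, 10.0)

F = fuzzy_code(x, hinges)   # e.g. 2.5 is half "low" and half "middle"
Z = crisp_code(x, hinges)   # one indicator column set per observation
x_back = defuzzify(F, hinges)
```

With triangular functions, defuzzifying the original fuzzy codes returns the original values exactly, whereas the crisp codes discard where a value lies within its category. In the paper's setting, defuzzification is applied to the fitted results of the analysis rather than to the input codes.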

Computation of multiple correspondence analysis, with code in R

Abstract:

The generalization of simple correspondence analysis, for two categorical variables, to multiple correspondence analysis, where there may be three or more variables, is not straightforward, from either a mathematical or a computational point of view. In this paper we detail the exact computational steps involved in performing a multiple correspondence analysis, including the special aspects of adjusting the principal inertias to correct the percentages of inertia, supplementary points and subset analysis. Furthermore, we give the algorithm for joint correspondence analysis where the cross-tabulations of all unique pairs of variables are analysed jointly. The code in the R language for every step of the computations is given, as well as the results of each computation.
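The basic computational steps can be sketched as follows. This is a Python/NumPy illustration, not the paper's R code. It builds the indicator and Burt matrices, obtains principal inertias from the singular value decomposition of the standardized residuals, and applies the adjustment of inertias commonly used for multiple correspondence analysis. All function names and the toy data are hypothetical, and it is an assumption that this is the same adjustment the paper describes.

```python
import numpy as np

def indicator_matrix(data):
    """Stack 0/1 dummy columns for every category of every variable."""
    data = np.asarray(data)
    blocks = []
    for q in range(data.shape[1]):
        levels = np.unique(data[:, q])
        blocks.append((data[:, [q]] == levels).astype(float))
    return np.hstack(blocks)

def principal_inertias(N):
    """Principal inertias from a correspondence analysis of the table N."""
    P = N / N.sum()                      # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)  # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    return np.linalg.svd(S, compute_uv=False) ** 2

def adjusted_inertias(indicator_inertias, Q):
    """Adjusted inertias, keeping only dimensions with inertia above 1/Q."""
    lam = np.asarray(indicator_inertias)
    keep = lam > 1.0 / Q
    return (Q / (Q - 1.0)) ** 2 * (lam[keep] - 1.0 / Q) ** 2

# Toy data (hypothetical): 6 cases, Q = 3 categorical variables.
data = np.array([[0, 0, 0], [0, 1, 1], [1, 0, 2],
                 [1, 1, 0], [0, 0, 1], [1, 1, 2]])
Q = data.shape[1]

Z = indicator_matrix(data)      # 6 x 7 indicator matrix
B = Z.T @ Z                     # 7 x 7 Burt matrix
lam_Z = principal_inertias(Z)   # inertias of the indicator matrix
lam_B = principal_inertias(B)   # inertias of the Burt matrix
adj = adjusted_inertias(lam_Z, Q)
```

Two properties serve as quick checks. The total inertia of an indicator matrix equals J/Q - 1, where J is the total number of categories, and the principal inertias of the Burt matrix are the squares of those of the indicator matrix.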

Multiple correspondence analysis of a subset of response categories

Abstract:

In the analysis of multivariate categorical data, typically the analysis of questionnaire data, it is often advantageous, for substantive and technical reasons, to analyse a subset of response categories. In multiple correspondence analysis, where each category is coded as a column of an indicator matrix or a row and column of the Burt matrix, it is not correct to simply analyse the corresponding submatrix of data, since the whole geometric structure is different for the submatrix. A simple modification of the correspondence analysis algorithm allows the overall geometric structure of the complete data set to be retained while calculating the solution for the selected subset of points. This strategy is useful for analysing patterns of response amongst any subset of categories and relating these patterns to demographic factors, especially for studying patterns of particular responses such as missing and neutral responses. The methodology is illustrated using data from the International Social Survey Program on Family and Changing Gender Roles in 1994.

Perceptual mapping of practical ethics along the value chain: A multiple correspondence analysis with industry and cultural indices as supplementary variables

Abstract:

This paper presents findings from a study investigating a firm's ethical practices along the value chain. In so doing we attempt to better understand potential relationships between a firm's ethical stance with its customers and those of its suppliers within a supply chain and identify particular sectoral and cultural influences that might impinge on this. Drawing upon a database comprising 667 industrial firms from 27 different countries, we found that ethical practices begin with the firm's relationship with its customers, the characteristics of which then influence the ethical stance with the firm's suppliers within the supply chain. Importantly, market structure along with some key cultural characteristics were also found to exert significant influence on the implementation of ethical policies in these firms.

Discriminant analysis of skull morphometric characters in Apodemus sylvaticus, A. flavicollis, and A. alpicola (Mammalia; Rodentia) from the Alps

Abstract:

A total of 108 Apodemus skulls from Switzerland, Austria, Italy, France and Germany were studied to determine morphological characteristics useful in identifying individuals as Apodemus sylvaticus (Linnaeus, 1758), A. flavicollis (Melchior, 1834) or A. alpicola Heinrich, 1952. The original assignment of the samples to the three species was based on molar cusp morphology, body proportions, pelage coloration, and allozyme analysis. The 24 measured cranial characters used together accurately discriminated between the three species and correctly classified 100% of the individuals to species. A stepwise discriminant function analysis showed that 6 cranial characters are sufficient to differentiate between the three species, with a correct classification above 97%. Fisher's linear discriminant function coefficients can be used directly for classification of unknown specimens.

Flooding extent cartography with Landsat TM imagery and regularized kernel Fisher's discriminant analysis
