5 results for Dendrogram
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
The use of orthonormal coordinates in the simplex and, particularly, balance coordinates, has suggested the use of a dendrogram for the exploratory analysis of compositional data. The dendrogram is based on a sequential binary partition of a compositional vector into groups of parts. At each step of a partition, one group of parts is divided into two new groups, and a balancing axis in the simplex between both groups is defined. The set of balancing axes constitutes an orthonormal basis, and the projections of the sample on them are orthogonal coordinates. They can be represented in a dendrogram-like graph showing: (a) the way of grouping parts of the compositional vector; (b) the explanatory role of each subcomposition generated in the partition process; (c) the decomposition of the total variance into balance components associated with each binary partition; (d) a box-plot of each balance. This representation is useful to help the interpretation of balance coordinates; to identify which are the most explanatory coordinates; and to describe the whole sample in a single diagram independently of the number of parts of the sample.
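As an illustration of the balance coordinates described in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes the usual definition of a balance between a group of r parts and a group of s parts, sqrt(r·s/(r+s)) · ln(g(x+)/g(x−)), where g is the geometric mean. The function names, the 4-part composition and the partition are hypothetical.

```python
import math

def balance(x, plus, minus):
    """Balance between two groups of parts of the composition x.

    plus and minus are lists of part indices; the formula used is the
    standard normalised log-ratio of geometric means (an assumption here,
    since the abstract does not spell it out).
    """
    r, s = len(plus), len(minus)
    mean_log_plus = sum(math.log(x[i]) for i in plus) / r
    mean_log_minus = sum(math.log(x[i]) for i in minus) / s
    return math.sqrt(r * s / (r + s)) * (mean_log_plus - mean_log_minus)

def balances(x, sbp):
    """One balance coordinate per step of the sequential binary partition."""
    return [balance(x, plus, minus) for plus, minus in sbp]

# Hypothetical partition of a 4-part composition:
# step 1 splits {0, 1} from {2, 3}; steps 2 and 3 split each pair.
sbp = [([0, 1], [2, 3]), ([0], [1]), ([2], [3])]
x = [0.1, 0.2, 0.3, 0.4]
coords = balances(x, sbp)
```

A D-part composition yields D − 1 balances, and the result does not change if x is rescaled, since only ratios between parts enter the formula.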
Abstract:
Within the special geometry of the simplex, the sample space of compositional data, compositional orthonormal coordinates allow the application of any multivariate statistical approach. The search for meaningful coordinates has suggested balances (between two groups of parts), based on a sequential binary partition of a D-part composition, and a representation in the form of a CoDa-dendrogram. Projected samples are represented in a dendrogram-like graph showing: (a) the way of grouping parts; (b) the explanatory role of subcompositions generated in the partition process; (c) the decomposition of the variance; (d) the center and quantiles of each balance. The representation is useful for the interpretation of balances and to describe the sample in a single diagram independently of the number of parts. Also, samples of two or more populations, as well as several samples from the same population, can be represented in the same graph, as long as they have the same parts registered. The approach is illustrated with an example of food consumption in Europe.
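The per-balance quantities listed under (c) and (d) can be sketched as follows. This is only an illustrative summary under stated assumptions: the input is taken to be balance coordinates that have already been computed, the population variance is used (a sample variance would serve equally well), and the function name and data are hypothetical.

```python
import statistics

def summarize_balances(coords):
    """Summarise balance coordinates.

    coords is a list of samples, each a list of balance coordinates.
    Returns, per balance, its mean (center), variance, share of the
    total variance, and quartiles.
    """
    columns = list(zip(*coords))  # one tuple of values per balance
    variances = [statistics.pvariance(col) for col in columns]
    total = sum(variances)  # total variance as the sum over balances
    summary = []
    for col, var in zip(columns, variances):
        q1, median, q3 = statistics.quantiles(col, n=4)
        summary.append({
            "mean": statistics.fmean(col),
            "variance": var,
            "share": var / total,
            "quartiles": (q1, median, q3),
        })
    return summary

# Hypothetical data: four samples, two balances each.
coords = [[0.5, -1.0], [0.7, -0.2], [0.4, 0.3], [0.8, 1.1]]
summary = summarize_balances(coords)
```

Because the balances form an orthonormal coordinate system, their variances add up to the total variance, so the "share" values sum to one and indicate which balances are most explanatory.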
Abstract:
The R-package “compositions” is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, containing now some facilities to work with and represent ilr bases built from balances, and an elaborated subsystem for dealing with several kinds of irregular data: (rounded or structural) zeroes, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a “regular” subcomposition (where all parts are actually observed and the datum behaves typically) and a “problematic” subcomposition (with those unobserved, zero or rounded parts, or else where the datum shows an erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros) focusing on the nature of irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcomposition where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occurrence and Monte-Carlo-based tests for their characterization. Furthermore the package provides special plots helping to understand the nature of outliers in the dataset.
Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros
Abstract:
Capsule and seed morphology of W. European species of Euphorbia aggr. flavicoma has been studied. A total of 1500 seeds from 13 taxa have been investigated under light microscope, scanning electron microscope and binocular stereoscope. Data were processed by multivariate analysis and the corresponding dendrogram is presented. At the end of the paper, a key is presented that allows the separation of taxa down to the species level.
Abstract:
Comparative analysis of gene fragments of six housekeeping loci, distributed around the two chromosomes of Vibrio cholerae, has been carried out for a collection of 29 V. cholerae O139 Bengal strains isolated from India during the first epidemic period (1992 to 1993). A toxigenic O1 ElTor strain from the seventh pandemic and an environmental non-O1/non-O139 strain were also included in this study. All loci studied were polymorphic, with a small number of polymorphic sites in the sequenced fragments. The genetic diversity determined for our O139 population is concordant with a previous multilocus enzyme electrophoresis study in which we analyzed the same V. cholerae O139 strains. In both studies we have found a higher genetic diversity than reported previously in other molecular studies. The results of the present work showed that O139 strains clustered in several lineages of the dendrogram generated from the matrix of allelic mismatches between the different genotypes, a finding which does not support the previously reported hypothesis that the O139 serogroup is a unique clone. The statistical analysis performed on the V. cholerae O139 isolates suggested a clonal population structure. Moreover, the application of Sawyer's test and split decomposition to detect intragenic recombination in the sequenced gene fragments did not indicate the existence of recombination in our O139 population.
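The matrix of allelic mismatches mentioned in this abstract can be illustrated with a short sketch. It assumes each genotype is an allelic profile with one allele number per locus and that the mismatch between two genotypes is the count of loci at which their alleles differ. The genotype labels and allele numbers below are invented for illustration and are not data from the study; the clustering method used to build the dendrogram is not stated in the abstract and is not shown.

```python
def allelic_mismatches(profile_a, profile_b):
    """Number of loci at which two allelic profiles differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(profile_a, profile_b))

def mismatch_matrix(profiles):
    """Pairwise mismatch counts as a nested dict keyed by genotype label."""
    names = list(profiles)
    return {
        a: {b: allelic_mismatches(profiles[a], profiles[b]) for b in names}
        for a in names
    }

# Hypothetical genotypes over six loci (allele numbers are made up).
profiles = {
    "G1": (1, 1, 1, 1, 1, 1),
    "G2": (1, 2, 1, 1, 3, 1),
    "G3": (2, 2, 1, 4, 3, 1),
}
matrix = mismatch_matrix(profiles)
```

Such a matrix is symmetric with a zero diagonal, so it can be passed as a distance matrix to any hierarchical clustering routine to obtain a dendrogram.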