9 resultados para n-way analysis
em Universitat de Girona, Spain
Resumo:
Starting with logratio biplots for compositional data, which are based on the principle of subcompositional coherence, and then adding weights, as in correspondence analysis, we rediscover Lewi's spectral map and many connections to analyses of two-way tables of non-negative data. Thanks to the weighting, the method also achieves the property of distributional equivalence
Resumo:
”compositions” is a new R-package for the analysis of compositional and positive data. It contains four classes corresponding to the four different types of compositional and positive geometry (including the Aitchison geometry). It provides means for computation, plotting and high-level multivariate statistical analysis in all four geometries. These geometries are treated in an fully analogous way, based on the principle of working in coordinates, and the object-oriented programming paradigm of R. In this way, called functions automatically select the most appropriate type of analysis as a function of the geometry. The graphical capabilities include ternary diagrams and tetrahedrons, various compositional plots (boxplots, barplots, piecharts) and extensive graphical tools for principal components. Afterwards, ortion and proportion lines, straight lines and ellipses in all geometries can be added to plots. The package is accompanied by a hands-on-introduction, documentation for every function, demos of the graphical capabilities and plenty of usage examples. It allows direct and parallel computation in all four vector spaces and provides the beginner with a copy-and-paste style of data analysis, while letting advanced users keep the functionality and customizability they demand of R, as well as all necessary tools to add own analysis routines. A complete example is included in the appendix
Resumo:
There are two principal chemical concepts that are important for studying the natural environment. The first one is thermodynamics, which describes whether a system is at equilibrium or can spontaneously change by chemical reactions. The second main concept is how fast chemical reactions (kinetics or rate of chemical change) take place whenever they start. In this work we examine a natural system in which both thermodynamics and kinetic factors are important in determining the abundance of NH+4 , NO−2 and NO−3 in superficial waters. Samples were collected in the Arno Basin (Tuscany, Italy), a system in which natural and antrophic effects both contribute to highly modify the chemical composition of water. Thermodynamical modelling based on the reduction-oxidation reactions involving the passage NH+4 -> NO−2 -> NO−3 in equilibrium conditions has allowed to determine the Eh redox potential values able to characterise the state of each sample and, consequently, of the fluid environment from which it was drawn. Just as pH expresses the concentration of H+ in solution, redox potential is used to express the tendency of an environment to receive or supply electrons. In this context, oxic environments, as those of river systems, are said to have a high redox potential because O2 is available as an electron acceptor. Principles of thermodynamics and chemical kinetics allow to obtain a model that often does not completely describe the reality of natural systems. Chemical reactions may indeed fail to achieve equilibrium because the products escape from the site of the rection or because reactions involving the trasformation are very slow, so that non-equilibrium conditions exist for long periods. Moreover, reaction rates can be sensitive to poorly understood catalytic effects or to surface effects, while variables as concentration (a large number of chemical species can coexist and interact concurrently), temperature and pressure can have large gradients in natural systems. By taking into account this, data of 91 water samples have been modelled by using statistical methodologies for compositional data. The application of log–contrast analysis has allowed to obtain statistical parameters to be correlated with the calculated Eh values. In this way, natural conditions in which chemical equilibrium is hypothesised, as well as underlying fast reactions, are compared with those described by a stochastic approach
Resumo:
Most of economic literature has presented its analysis under the assumption of homogeneous capital stock. However, capital composition differs across countries. What has been the pattern of capital composition associated with World economies? We make an exploratory statistical analysis based on compositional data transformed by Aitchinson logratio transformations and we use tools for visualizing and measuring statistical estimators of association among the components. The goal is to detect distinctive patterns in the composition. As initial findings could be cited that: 1. Sectorial components behaved in a correlated way, building industries on one side and , in a less clear view, equipment industries on the other. 2. Full sample estimation shows a negative correlation between durable goods component and other buildings component and between transportation and building industries components. 3. Countries with zeros in some components are mainly low income countries at the bottom of the income category and behaved in a extreme way distorting main results observed in the full sample. 4. After removing these extreme cases, conclusions seem not very sensitive to the presence of another isolated cases
Resumo:
Several eco-toxicological studies have shown that insectivorous mammals, due to their feeding habits, easily accumulate high amounts of pollutants in relation to other mammal species. To assess the bio-accumulation levels of toxic metals and their in°uence on essential metals, we quantified the concentration of 19 elements (Ca, K, Fe, B, P, S, Na, Al, Zn, Ba, Rb, Sr, Cu, Mn, Hg, Cd, Mo, Cr and Pb) in bones of 105 greater white-toothed shrews (Crocidura russula) from a polluted (Ebro Delta) and a control (Medas Islands) area. Since chemical contents of a bio-indicator are mainly compositional data, conventional statistical analyses currently used in eco-toxicology can give misleading results. Therefore, to improve the interpretation of the data obtained, we used statistical techniques for compositional data analysis to define groups of metals and to evaluate the relationships between them, from an inter-population viewpoint. Hypothesis testing on the adequate balance-coordinates allow us to confirm intuition based hypothesis and some previous results. The main statistical goal was to test equal means of balance-coordinates for the two defined populations. After checking normality, one-way ANOVA or Mann-Whitney tests were carried out for the inter-group balances
Resumo:
A joint distribution of two discrete random variables with finite support can be displayed as a two way table of probabilities adding to one. Assume that this table has n rows and m columns and all probabilities are non-null. This kind of table can be seen as an element in the simplex of n · m parts. In this context, the marginals are identified as compositional amalgams, conditionals (rows or columns) as subcompositions. Also, simplicial perturbation appears as Bayes theorem. However, the Euclidean elements of the Aitchison geometry of the simplex can also be translated into the table of probabilities: subspaces, orthogonal projections, distances. Two important questions are addressed: a) given a table of probabilities, which is the nearest independent table to the initial one? b) which is the largest orthogonal projection of a row onto a column? or, equivalently, which is the information in a row explained by a column, thus explaining the interaction? To answer these questions three orthogonal decompositions are presented: (1) by columns and a row-wise geometric marginal, (2) by rows and a columnwise geometric marginal, (3) by independent two-way tables and fully dependent tables representing row-column interaction. An important result is that the nearest independent table is the product of the two (row and column)-wise geometric marginal tables. A corollary is that, in an independent table, the geometric marginals conform with the traditional (arithmetic) marginals. These decompositions can be compared with standard log-linear models. Key words: balance, compositional data, simplex, Aitchison geometry, composition, orthonormal basis, arithmetic and geometric marginals, amalgam, dependence measure, contingency table
Resumo:
Many multivariate methods that are apparently distinct can be linked by introducing one or more parameters in their definition. Methods that can be linked in this way are correspondence analysis, unweighted or weighted logratio analysis (the latter also known as "spectral mapping"), nonsymmetric correspondence analysis, principal component analysis (with and without logarithmic transformation of the data) and multidimensional scaling. In this presentation I will show how several of these methods, which are frequently used in compositional data analysis, may be linked through parametrizations such as power transformations, linear transformations and convex linear combinations. Since the methods of interest here all lead to visual maps of data, a "movie" can be made where where the linking parameter is allowed to vary in small steps: the results are recalculated "frame by frame" and one can see the smooth change from one method to another. Several of these "movies" will be shown, giving a deeper insight into the similarities and differences between these methods
Resumo:
Factor analysis as frequent technique for multivariate data inspection is widely used also for compositional data analysis. The usual way is to use a centered logratio (clr) transformation to obtain the random vector y of dimension D. The factor model is then y = Λf + e (1) with the factors f of dimension k < D, the error term e, and the loadings matrix Λ. Using the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as Cov(y) = ΛΛT + ψ (2) where ψ = Cov(e) has a diagonal form. The diagonal elements of ψ as well as the loadings matrix Λ are estimated from an estimation of Cov(y). Given observed clr transformed data Y as realizations of the random vector y. Outliers or deviations from the idealized model assumptions of factor analysis can severely effect the parameter estimation. As a way out, robust estimation of the covariance matrix of Y will lead to robust estimates of Λ and ψ in (2), see Pison et al. (2003). Well known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g. Maronna et al., 2006), rely on a full-rank data matrix Y which is not the case for clr transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix Y is transformed to a matrix Z by using an orthonormal basis of lower dimension. Using the ilr transformed data, a robust covariance matrix C(Z) can be estimated. The result can be back-transformed to the clr space by C(Y ) = V C(Z)V T where the matrix V with orthonormal columns comes from the relation between the clr and the ilr transformation. Now the parameters in the model (2) can be estimated (Basilevsky, 1994) and the results have a direct interpretation since the links to the original variables are still preserved. The above procedure will be applied to data from geochemistry. Our special interest is on comparing the results with those of Reimann et al. (2002) for the Kola project data
Resumo:
We present a method for analyzing the curvature (second derivatives) of the conical intersection hyperline at an optimized critical point. Our method uses the projected Hessians of the degenerate states after elimination of the two branching space coordinates, and is equivalent to a frequency calculation on a single Born-Oppenheimer potential-energy surface. Based on the projected Hessians, we develop an equation for the energy as a function of a set of curvilinear coordinates where the degeneracy is preserved to second order (i.e., the conical intersection hyperline). The curvature of the potential-energy surface in these coordinates is the curvature of the conical intersection hyperline itself, and thus determines whether one has a minimum or saddle point on the hyperline. The equation used to classify optimized conical intersection points depends in a simple way on the first- and second-order degeneracy splittings calculated at these points. As an example, for fulvene, we show that the two optimized conical intersection points of C2v symmetry are saddle points on the intersection hyperline. Accordingly, there are further intersection points of lower energy, and one of C2 symmetry - presented here for the first time - is found to be the global minimum in the intersection space