163 resultados para singular-value decomposition

em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biplots are graphical displays of data matrices based on the decomposition of a matrix as the product of two matrices. Elements of these two matrices are used as coordinates for the rows and columns of the data matrix, with an interpretation of the joint presentation that relies on the properties of the scalar product. Because the decomposition is not unique, there are several alternative ways to scale the row and column points of the biplot, which can cause confusion amongst users, especially when software packages are not united in their approach to this issue. We propose a new scaling of the solution, called the standard biplot, which applies equally well to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. The standard biplot also handles data matrices with widely different levels of inherent variance. Two concepts taken from correspondence analysis are important to this idea: the weighting of row and column points, and the contributions made by the points to the solution. In the standard biplot one set of points, usually the rows of the data matrix, optimally represent the positions of the cases or sample units, which are weighted and usually standardized in some way unless the matrix contains values that are comparable in their raw form. The other set of points, usually the columns, is represented in accordance with their contributions to the low-dimensional solution. As for any biplot, the projections of the row points onto vectors defined by the column points approximate the centred and (optionally) standardized data. The method is illustrated with several examples to demonstrate how the standard biplot copes in different situations to give a joint map which needs only one common scale on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot readable. The proposal also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We compare two methods for visualising contingency tables and developa method called the ratio map which combines the good properties of both.The first is a biplot based on the logratio approach to compositional dataanalysis. This approach is founded on the principle of subcompositionalcoherence, which assures that results are invariant to considering subsetsof the composition. The second approach, correspondence analysis, isbased on the chi-square approach to contingency table analysis. Acornerstone of correspondence analysis is the principle of distributionalequivalence, which assures invariance in the results when rows or columnswith identical conditional proportions are merged. Both methods may bedescribed as singular value decompositions of appropriately transformedmatrices. Correspondence analysis includes a weighting of the rows andcolumns proportional to the margins of the table. If this idea of row andcolumn weights is introduced into the logratio biplot, we obtain a methodwhich obeys both principles of subcompositional coherence and distributionalequivalence.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Power transformations of positive data tables, prior to applying the correspondence analysis algorithm, are shown to open up a family of methods with direct connections to the analysis of log-ratios. Two variations of this idea are illustrated. The first approach is simply to power the original data and perform a correspondence analysis this method is shown to converge to unweighted log-ratio analysis as the power parameter tends to zero. The second approach is to apply the power transformation to thecontingency ratios, that is the values in the table relative to expected values based on the marginals this method converges to weighted log-ratio analysis, or the spectral map. Two applications are described: first, a matrix of population genetic data which is inherently two-dimensional, and second, a larger cross-tabulation with higher dimensionality, from a linguistic analysis of several books.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In order to interpret the biplot it is necessary to know which points usually variables are the ones that are important contributors to the solution, and this information is available separately as part of the biplot s numerical results. We propose a new scaling of the display, called the contribution biplot, which incorporates this diagnostic directly into the graphical display, showing visually the important contributors and thus facilitating the biplot interpretation and often simplifying the graphical representation considerably. The contribution biplot can be applied to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. In the contribution biplot one set of points, usually the rows of the data matrix, optimally represent the spatial positions of the cases or sample units, according to some distance measure that usually incorporates some form of standardization unless all data are comparable in scale. The other set of points, usually the columns, is represented by vectors that are related to their contributions to the low-dimensional solution. A fringe benefit is that usually only one common scale for row and column points is needed on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot legible. Furthermore, this version of the biplot also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important, when they are in fact contributing minimally to the solution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The generalization of simple correspondence analysis, for two categorical variables, to multiple correspondence analysis where they may be three or more variables, is not straighforward, both from a mathematical and computational point of view. In this paper we detail the exact computational steps involved in performing a multiple correspondence analysis, including the special aspects of adjusting the principal inertias to correct the percentages of inertia, supplementary points and subset analysis. Furthermore, we give the algorithm for joint correspondence analysis where the cross-tabulations of all unique pairs of variables are analysed jointly. The code in the R language for every step of the computations is given, as well as the results of each computation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to spectral mapping , a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper establishes a general framework for metric scaling of any distance measure between individuals based on a rectangular individuals-by-variables data matrix. The method allows visualization of both individuals and variables as well as preserving all the good properties of principal axis methods such as principal components and correspondence analysis, based on the singular-value decomposition, including the decomposition of variance into components along principal axes which provide the numerical diagnostics known as contributions. The idea is inspired from the chi-square distance in correspondence analysis which weights each coordinate by an amount calculated from the margins of the data table. In weighted metric multidimensional scaling (WMDS) we allow these weights to be unknown parameters which are estimated from the data to maximize the fit to the original distances. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing a matrix and displaying its rows and columns in biplots.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many multivariate methods that are apparently distinct can be linked by introducing oneor more parameters in their definition. Methods that can be linked in this way arecorrespondence analysis, unweighted or weighted logratio analysis (the latter alsoknown as "spectral mapping"), nonsymmetric correspondence analysis, principalcomponent analysis (with and without logarithmic transformation of the data) andmultidimensional scaling. In this presentation I will show how several of thesemethods, which are frequently used in compositional data analysis, may be linkedthrough parametrizations such as power transformations, linear transformations andconvex linear combinations. Since the methods of interest here all lead to visual mapsof data, a "movie" can be made where where the linking parameter is allowed to vary insmall steps: the results are recalculated "frame by frame" and one can see the smoothchange from one method to another. Several of these "movies" will be shown, giving adeeper insight into the similarities and differences between these methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We consider the joint visualization of two matrices which have common rowsand columns, for example multivariate data observed at two time pointsor split accord-ing to a dichotomous variable. Methods of interest includeprincipal components analysis for interval-scaled data, or correspondenceanalysis for frequency data or ratio-scaled variables on commensuratescales. A simple result in matrix algebra shows that by setting up thematrices in a particular block format, matrix sum and difference componentscan be visualized. The case when we have more than two matrices is alsodiscussed and the methodology is applied to data from the InternationalSocial Survey Program.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The generalization of simple (two-variable) correspondence analysis to more than two categorical variables, commonly referred to as multiple correspondence analysis, is neither obvious nor well-defined. We present two alternative ways of generalizing correspondence analysis, one based on the quantification of the variables and intercorrelation relationships, and the other based on the geometric ideas of simple correspondence analysis. We propose a version of multiple correspondence analysis, with adjusted principal inertias, as the method of choice for the geometric definition, since it contains simple correspondence analysis as an exact special case, which is not the situation of the standard generalizations. We also clarify the issue of supplementary point representation and the properties of joint correspondence analysis, a method that visualizes all two-way relationships between the variables. The methodology is illustrated using data on attitudes to science from the International Social Survey Program on Environment in 1993.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The singular value decomposition and its interpretation as alinear biplot has proved to be a powerful tool for analysing many formsof multivariate data. Here we adapt biplot methodology to the specifficcase of compositional data consisting of positive vectors each of whichis constrained to have unit sum. These relative variation biplots haveproperties relating to special features of compositional data: the studyof ratios, subcompositions and models of compositional relationships. Themethodology is demonstrated on a data set consisting of six-part colourcompositions in 22 abstract paintings, showing how the singular valuedecomposition can achieve an accurate biplot of the colour ratios and howpossible models interrelating the colours can be diagnosed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, the method leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a differential synthetic apertureradar (SAR) interferometry (DIFSAR) approach for investigatingdeformation phenomena on full-resolution DIFSAR interferograms.In particular, our algorithm extends the capabilityof the small-baseline subset (SBAS) technique that relies onsmall-baseline DIFSAR interferograms only and is mainly focusedon investigating large-scale deformations with spatial resolutionsof about 100 100 m. The proposed technique is implemented byusing two different sets of data generated at low (multilook data)and full (single-look data) spatial resolution, respectively. Theformer is used to identify and estimate, via the conventional SBAStechnique, large spatial scale deformation patterns, topographicerrors in the available digital elevation model, and possibleatmospheric phase artifacts; the latter allows us to detect, onthe full-resolution residual phase components, structures highlycoherent over time (buildings, rocks, lava, structures, etc.), as wellas their height and displacements. In particular, the estimation ofthe temporal evolution of these local deformations is easily implementedby applying the singular value decomposition technique.The proposed algorithm has been tested with data acquired by theEuropean Remote Sensing satellites relative to the Campania area(Italy) and validated by using geodetic measurements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a Bayesian approach to the design of transmit prefiltering matrices in closed-loop schemes robust to channel estimation errors. The algorithms are derived for a multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) system. Two different optimizationcriteria are analyzed: the minimization of the mean square error and the minimization of the bit error rate. In both cases, the transmitter design is based on the singular value decomposition (SVD) of the conditional mean of the channel response, given the channel estimate. The performance of the proposed algorithms is analyzed,and their relationship with existing algorithms is indicated. As withother previously proposed solutions, the minimum bit error rate algorithmconverges to the open-loop transmission scheme for very poor CSI estimates.