12 results for contingency table

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance:

70.00%

Publisher:

Abstract:

By using suitable parameters, we present a unified approach for describing four methods for representing categorical data in a contingency table. These methods include: correspondence analysis (CA), the alternative approach using Hellinger distance (HD), the log-ratio (LR) alternative, which is appropriate for compositional data, and the so-called non-symmetrical correspondence analysis (NSCA). We then make an appropriate comparison among these four methods and some illustrative examples are given. Some approaches based on cumulative frequencies are also linked and studied using matrices.

Key words: Correspondence analysis, Hellinger distance, Non-symmetrical correspondence analysis, log-ratio analysis, Taguchi inertia

Relevance:

60.00%

Publisher:

Abstract:

We compare correspondence analysis to the logratio approach based on compositional data. We also compare correspondence analysis and an alternative approach using Hellinger distance, for representing categorical data in a contingency table. We propose a coefficient which globally measures the similarity between these approaches. This coefficient can be decomposed into several components, one component for each principal dimension, indicating the contribution of the dimensions to the difference between the two representations. These three methods of representation can produce quite similar results. One illustrative example is given.

Relevance:

60.00%

Publisher:

Abstract:

A joint distribution of two discrete random variables with finite support can be displayed as a two-way table of probabilities adding to one. Assume that this table has n rows and m columns and all probabilities are non-null. This kind of table can be seen as an element in the simplex of n · m parts. In this context, the marginals are identified as compositional amalgams, conditionals (rows or columns) as subcompositions. Also, simplicial perturbation appears as Bayes theorem. However, the Euclidean elements of the Aitchison geometry of the simplex can also be translated into the table of probabilities: subspaces, orthogonal projections, distances.

Two important questions are addressed: a) given a table of probabilities, which is the nearest independent table to the initial one? b) which is the largest orthogonal projection of a row onto a column? or, equivalently, which is the information in a row explained by a column, thus explaining the interaction? To answer these questions three orthogonal decompositions are presented: (1) by columns and a row-wise geometric marginal, (2) by rows and a column-wise geometric marginal, (3) by independent two-way tables and fully dependent tables representing row-column interaction. An important result is that the nearest independent table is the product of the two (row and column)-wise geometric marginal tables. A corollary is that, in an independent table, the geometric marginals conform with the traditional (arithmetic) marginals. These decompositions can be compared with standard log-linear models.

Key words: balance, compositional data, simplex, Aitchison geometry, composition, orthonormal basis, arithmetic and geometric marginals, amalgam, dependence measure, contingency table
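
The result on the nearest independent table can be illustrated with a minimal sketch. It assumes that "geometric marginal" means the geometric mean taken across each row (or each column), rescaled to sum to one, and that "product" means the outer product of those two vectors. The function names and the example table are hypothetical and are not taken from the paper.

```python
import math

def closure(values):
    """Rescale positive values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]

def geometric_marginals(table):
    """Closed geometric means of each row and of each column.

    Assumption: this is what the abstract calls the geometric marginals.
    All entries must be strictly positive.
    """
    n, m = len(table), len(table[0])
    row_gm = [math.exp(sum(math.log(p) for p in row) / m) for row in table]
    col_gm = [math.exp(sum(math.log(table[i][j]) for i in range(n)) / n)
              for j in range(m)]
    return closure(row_gm), closure(col_gm)

def nearest_independent_table(table):
    """Outer product of the two closed geometric marginals."""
    rows, cols = geometric_marginals(table)
    return [[r * c for c in cols] for r in rows]

# Hypothetical 2 x 3 table of probabilities (entries sum to one).
table = [[0.10, 0.20, 0.10],
         [0.30, 0.10, 0.20]]
independent = nearest_independent_table(table)
```

Applying `nearest_independent_table` to a table that is already independent should return the same table, which is one way to check the corollary about geometric and arithmetic marginals.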

Relevance:

60.00%

Publisher:

Abstract:

We compare two methods for visualising contingency tables and develop a method called the ratio map which combines the good properties of both. The first is a biplot based on the logratio approach to compositional data analysis. This approach is founded on the principle of subcompositional coherence, which assures that results are invariant to considering subsets of the composition. The second approach, correspondence analysis, is based on the chi-square approach to contingency table analysis. A cornerstone of correspondence analysis is the principle of distributional equivalence, which assures invariance in the results when rows or columns with identical conditional proportions are merged. Both methods may be described as singular value decompositions of appropriately transformed matrices. Correspondence analysis includes a weighting of the rows and columns proportional to the margins of the table. If this idea of row and column weights is introduced into the logratio biplot, we obtain a method which obeys both principles of subcompositional coherence and distributional equivalence.
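
The principle of distributional equivalence can be checked numerically on the chi-square side. The sketch below computes total inertia (the Pearson chi-square statistic divided by the grand total) and compares a table with its version after merging two rows that have identical conditional proportions. The tables and the function name are hypothetical, and the sketch covers only the total inertia, not the full singular value decomposition.

```python
def total_inertia(table):
    """Pearson chi-square statistic of a contingency table divided by its grand total."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2 / n

# Rows 1 and 2 have the same conditional proportions (1 : 2 : 3).
original = [[10, 20, 30],
            [20, 40, 60],
            [30, 10, 5]]

# The same table with rows 1 and 2 merged into one row.
merged = [[30, 60, 90],
          [30, 10, 5]]

difference = abs(total_inertia(original) - total_inertia(merged))
```

Under distributional equivalence, `difference` should be zero up to floating-point rounding.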

Relevance:

60.00%

Publisher:

Abstract:

Although correspondence analysis is now widely available in statistical software packages and applied in a variety of contexts, notably the social and environmental sciences, there are still some misconceptions about this method as well as unresolved issues which remain controversial to this day. In this paper we hope to settle these matters, namely (i) the way CA measures variance in a two-way table and how to compare variances between tables of different sizes, (ii) the influence, or rather lack of influence, of outliers in the usual CA maps, (iii) the scaling issue and the biplot interpretation of maps, (iv) whether or not to rotate a solution, and (v) statistical significance of results.

Relevance:

60.00%

Publisher:

Abstract:

In the scope of the European project Hydroptimet, INTERREG IIIB-MEDOCC programme, a limited area model (LAM) intercomparison is performed for intense events that caused extensive damage to people and territory. As the comparison is limited to single case studies, the work is not meant to provide a measure of the different models' skill, but to identify the key model factors needed for a good forecast of this kind of meteorological phenomenon. This work focuses on the Spanish flash-flood event, also known as the "Montserrat-2000" event. The study is performed using forecast data from seven operational LAMs, placed at partners' disposal via the Hydroptimet ftp site, and observed data from the Catalonia rain gauge network. To improve the event analysis, satellite rainfall estimates have also been considered. For statistical evaluation of quantitative precipitation forecasts (QPFs), several non-parametric skill scores based on contingency tables have been used. Furthermore, for each model run it has been possible to identify the Catalonia regions affected by misses and false alarms using contingency table elements. Moreover, the standard "eyeball" analysis of forecast and observed precipitation fields has been supported by the use of a state-of-the-art diagnostic method, the contiguous rain area (CRA) analysis. This method makes it possible to quantify the spatial shift forecast error and to identify the error sources that affected each model's forecasts. High-resolution modelling and domain size seem to play a key role in providing a skillful forecast. Further work is needed to support this statement, including verification using a wider observational data set.
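
The abstract does not say which skill scores were used. As an illustration only, the sketch below computes several scores that are commonly derived from a 2 x 2 forecast/observation contingency table (hits, misses, false alarms, correct negatives). The function name and the counts are hypothetical, and the code assumes all denominators are non-zero.

```python
def skill_scores(hits, misses, false_alarms, correct_negatives):
    """Common categorical verification scores from a 2 x 2 contingency table.

    Assumption: rain/no-rain events are defined by a fixed precipitation
    threshold before counting; the threshold itself is not modelled here.
    """
    pod = hits / (hits + misses)                             # probability of detection
    far = false_alarms / (hits + false_alarms)               # false alarm ratio
    csi = hits / (hits + misses + false_alarms)              # critical success index
    bias = (hits + false_alarms) / (hits + misses)           # frequency bias
    pofd = false_alarms / (false_alarms + correct_negatives) # probability of false detection
    return {
        "POD": pod,
        "FAR": far,
        "CSI": csi,
        "BIAS": bias,
        "HK": pod - pofd,  # Hanssen-Kuipers (true skill) score
    }

# Hypothetical counts for one model run and one rainfall threshold.
scores = skill_scores(hits=50, misses=10, false_alarms=20, correct_negatives=920)
```

A frequency bias above one indicates that the event was forecast more often than it was observed; values below one indicate under-forecasting.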

Relevance:

30.00%

Publisher:

Abstract:

A condition needed for testing nested hypotheses from a Bayesian viewpoint is that the prior for the alternative model concentrates mass around the small, or null, model. For testing independence in contingency tables, the intrinsic priors satisfy this requirement. Further, the degree of concentration of the priors is controlled by a discrete parameter m, the training sample size, which plays an important role in the resulting answer regardless of the sample size.

In this paper we study robustness of the tests of independence in contingency tables with respect to the intrinsic priors with different degrees of concentration around the null, and compare with other "robust" results by Good and Crook. Consistency of the intrinsic Bayesian tests is established.

We also discuss conditioning issues and sampling schemes, and argue that conditioning should be on either one margin or the table total, but not on both margins.

Examples using real and simulated data are given.

Relevance:

20.00%

Publisher:

Abstract:

In this note we quantify to what extent indirect taxation influences and distorts prices. To do so we use the networked accounting structure of the most recent input-output table of Catalonia, an autonomous region of Spain, to model price formation. The role of indirect taxation is considered both from a classical value perspective and a more neoclassical flavoured one. We show that they would yield equivalent results under some basic premises. The neoclassical perspective, however, offers a bit more flexibility to distinguish among different tax figures and hence provide a clearer disaggregate picture of how an indirect tax ends up affecting, and by how much, the cost structure.

Relevance:

20.00%

Publisher:

Abstract:

We will present an analysis of data from a literature review and semi-structured interviews with experts on OER, to identify different aspects of OER business models and to establish how the success of the OER initiatives is measured. The results collected thus far show that two different business models for OER initiatives exist, but no data on their success or failure is published. We propose a framework for measuring success of OER initiatives.

Relevance:

20.00%

Publisher:

Abstract:

This work consists of a study of so-called serious games, games intended for learning. Specifically, the project focuses on serious games in the healthcare field. In addition to a state-of-the-art review, the work includes the development of a serious game called Optable for practising the preparation of surgical material on an operating table. This application has been developed under the GNU GPL licence.

Relevance:

20.00%

Publisher:

Abstract:

Power transformations of positive data tables, prior to applying the correspondence analysis algorithm, are shown to open up a family of methods with direct connections to the analysis of log-ratios. Two variations of this idea are illustrated. The first approach is simply to power the original data and perform a correspondence analysis; this method is shown to converge to unweighted log-ratio analysis as the power parameter tends to zero. The second approach is to apply the power transformation to the contingency ratios, that is, the values in the table relative to expected values based on the marginals; this method converges to weighted log-ratio analysis, or the spectral map. Two applications are described: first, a matrix of population genetic data which is inherently two-dimensional, and second, a larger cross-tabulation with higher dimensionality, from a linguistic analysis of several books.
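
One way to see why a power transformation connects to log-ratios is the standard limit (x^α − 1)/α → ln x as α → 0. The sketch below applies that Box-Cox form to the contingency ratios of a small table and measures how far the result is from the logarithm. The Box-Cox scaling is an assumption made here for illustration (the abstract only speaks of a power transformation), and the table and function names are hypothetical. The sketch does not perform the correspondence analysis step itself.

```python
import math

def contingency_ratios(table):
    """Observed values divided by expected values under independence."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    return [[observed * n / (row_tot[i] * col_tot[j])
             for j, observed in enumerate(row)]
            for i, row in enumerate(table)]

def box_cox(x, alpha):
    """Box-Cox power transformation (assumed scaling); tends to log(x) as alpha -> 0."""
    return (x ** alpha - 1.0) / alpha

def max_gap_to_log(table, alpha):
    """Largest absolute difference between the powered ratios and their logarithms."""
    ratios = contingency_ratios(table)
    return max(abs(box_cox(q, alpha) - math.log(q))
               for row in ratios for q in row)

# Hypothetical strictly positive table.
counts = [[10, 20, 30],
          [30, 10, 5],
          [8, 12, 40]]

# The gap shrinks as the power parameter approaches zero.
gaps = [max_gap_to_log(counts, alpha) for alpha in (1.0, 0.1, 0.001)]
```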

Relevance:

20.00%

Publisher:

Abstract:

We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to spectral mapping, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.