Biblioteca Digital

34 resultados para Tables(data)

em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain

Coherent forecasting of multiple-decrement life tables: a test using Japanese cause of death data

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Planners in public and private institutions would like coherent forecasts of the components of age-specic mortality, such as causes of death. This has been di cult toachieve because the relative values of the forecast components often fail to behave ina way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It hasbeen shown that cause-specic mortality forecasts are pessimistic when compared withall-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approachof using log mortality rates and forecasts the density of deaths in the life table. Sincethese values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbingstate), they are intrinsically relative rather than absolute values across decrements aswell as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison(1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that theunit sum constraint is honoured. The structure of the best-known, single-decrementmortality-rate forecasting model, devised by Lee and Carter (1992), is expressed incompositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortalityby cause of death for Japan

Distributional equivalence and subcompositional coherence in the analysis of contingency tables, ratio-scale measurements and compositional data

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to spectral mapping , a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.

Compositional data and Simpson's paradox

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Simpson's paradox, also known as amalgamation or aggregation paradox, appears whendealing with proportions. Proportions are by construction parts of a whole, which canbe interpreted as compositions assuming they only carry relative information. TheAitchison inner product space structure of the simplex, the sample space of compositions, explains the appearance of the paradox, given that amalgamation is a nonlinearoperation within that structure. Here we propose to use balances, which are specificelements of this structure, to analyse situations where the paradox might appear. Withthe proposed approach we obtain that the centre of the tables analysed is a naturalway to compare them, which avoids by construction the possibility of a paradox.Key words: Aitchison geometry, geometric mean, orthogonal projection

A unified approach for representing rows and columns in contingency tables

Relevância:

30.00% 30.00%

Publicador:

Resumo:

By using suitable parameters, we present a uni¯ed aproach for describing four methods for representing categorical data in a contingency table. These methods include:correspondence analysis (CA), the alternative approach using Hellinger distance (HD),the log-ratio (LR) alternative, which is appropriate for compositional data, and theso-called non-symmetrical correspondence analysis (NSCA). We then make an appropriate comparison among these four methods and some illustrative examples are given.Some approaches based on cumulative frequencies are also linked and studied usingmatrices.Key words: Correspondence analysis, Hellinger distance, Non-symmetrical correspondence analysis, log-ratio analysis, Taguchi inertia

Intrinsic test of independence in contingency tables

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A condition needed for testing nested hypotheses from a Bayesianviewpoint is that the prior for the alternative model concentratesmass around the small, or null, model. For testing independencein contingency tables, the intrinsic priors satisfy this requirement.Further, the degree of concentration of the priors is controlled bya discrete parameter m, the training sample size, which plays animportant role in the resulting answer regardless of the samplesize.In this paper we study robustness of the tests of independencein contingency tables with respect to the intrinsic priors withdifferent degree of concentration around the null, and comparewith other “robust” results by Good and Crook. Consistency ofthe intrinsic Bayesian tests is established.We also discuss conditioning issues and sampling schemes,and argue that conditioning should be on either one margin orthe table total, but not on both margins.Examples using real are simulated data are given

Performance assessment and league tables. Comparing like with like.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We formulate performance assessment as a problem of causal analysis and outline an approach based on the missing data principle for its solution. It is particularly relevant in the context of so-called league tables for educational, health-care and other public-service institutions. The proposed solution avoids comparisons of institutions that have substantially different clientele (intake).

Measures of fit in multiple correspondence analysis of crisp and fuzzy coded data

Relevância:

30.00% 30.00%

Publicador:

Resumo:

When continuous data are coded to categorical variables, two types of coding are possible: crisp coding in the form of indicator, or dummy, variables with values either 0 or 1; or fuzzy coding where each observation is transformed to a set of "degrees of membership" between 0 and 1, using co-called membership functions. It is well known that the correspondence analysis of crisp coded data, namely multiple correspondence analysis, yields principal inertias (eigenvalues) that considerably underestimate the quality of the solution in a low-dimensional space. Since the crisp data only code the categories to which each individual case belongs, an alternative measure of fit is simply to count how well these categories are predicted by the solution. Another approach is to consider multiple correspondence analysis equivalently as the analysis of the Burt matrix (i.e., the matrix of all two-way cross-tabulations of the categorical variables), and then perform a joint correspondence analysis to fit just the off-diagonal tables of the Burt matrix - the measure of fit is then computed as the quality of explaining these tables only. The correspondence analysis of fuzzy coded data, called "fuzzy multiple correspondence analysis", suffers from the same problem, albeit attenuated. Again, one can count how many correct predictions are made of the categories which have highest degree of membership. But here one can also defuzzify the results of the analysis to obtain estimated values of the original data, and then calculate a measure of fit in the familiar percentage form, thanks to the resultant orthogonal decomposition of variance. Furthermore, if one thinks of fuzzy multiple correspondence analysis as explaining the two-way associations between variables, a fuzzy Burt matrix can be computed and the same strategy as in the crisp case can be applied to analyse the off-diagonal part of this matrix. In this paper these alternative measures of fit are defined and applied to a data set of continuous meteorological variables, which are coded crisply and fuzzily into three categories. Measuring the fit is further discussed when the data set consists of a mixture of discrete and continuous variables.

Performance assessment and league tables. Comparing like with like

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We formulate performance assessment as a problem of causal analysis and outline an approach based on the missing data principle for its solution. It is particularly relevant in the context of so-called league tables for educational, health-care and other public-service institutions. The proposed solution avoids comparisons of institutions that have substantially different clientele (intake).

An adaptation of correspondence analysis for square tables

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The application of correspondence analysis to square asymmetrictables is often unsuccessful because of the strong role played by thediagonal entries of the matrix, obscuring the data off the diagonal. A simplemodification of the centering of the matrix, coupled with the correspondingchange in row and column masses and row and column metrics, allows the tableto be decomposed into symmetric and skew--symmetric components, which canthen be analyzed separately. The symmetric and skew--symmetric analyses canbe performed using a simple correspondence analysis program if the data areset up in a special block format.

Dynamic graphics of parametrically linked multivariate methods used in compositional data analysis

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Many multivariate methods that are apparently distinct can be linked by introducing oneor more parameters in their definition. Methods that can be linked in this way arecorrespondence analysis, unweighted or weighted logratio analysis (the latter alsoknown as "spectral mapping"), nonsymmetric correspondence analysis, principalcomponent analysis (with and without logarithmic transformation of the data) andmultidimensional scaling. In this presentation I will show how several of thesemethods, which are frequently used in compositional data analysis, may be linkedthrough parametrizations such as power transformations, linear transformations andconvex linear combinations. Since the methods of interest here all lead to visual mapsof data, a "movie" can be made where where the linking parameter is allowed to vary insmall steps: the results are recalculated "frame by frame" and one can see the smoothchange from one method to another. Several of these "movies" will be shown, giving adeeper insight into the similarities and differences between these methods.

Correspondence analysis of two transition tables

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The case of two transition tables is considered, that is two squareasymmetric matrices of frequencies where the rows and columns of thematrices are the same objects observed at three different timepoints. Different ways of visualizing the tables, either separatelyor jointly, are examined. We generalize an existing idea where asquare matrix is descomposed into symmetric and skew-symmetric partsto two matrices, leading to a decomposition into four components: (1)average symmetric, (2) average skew-symmetric, (3) symmetricdifference from average, and (4) skew-symmetric difference fromaverage. The method is illustrated with an artificial example and anexample using real data from a study of changing values over threegenerations.

The work of Spanish men. A quantitative analysis based on census data, 1900-1970

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a historical examination of employment in old age in Spain, in order to characterize this labour segment and identify and analyse its specific problems. One of these problems is the life-cycle deskilling process, already shown for certain national cases. This study explores whether this hypothesis also holds in Spain. The perspective used is essentially quantitative, as our analysis is based on the age-profession tables in Spanish population censuses from 1900 to 1970.

The work of Spanish men. A quantitative analysis based on census data, 1900-1970

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a historical examination of employment in old age in Spain, in order to characterize this labour segment and identify and analyse its specific problems. One of these problems is the life-cycle deskilling process, already shown for certain national cases. This study explores whether this hypothesis also holds in Spain. The perspective used is essentially quantitative, as our analysis is based on the age-profession tables in Spanish population censuses from 1900 to 1970.

Phenol-Explorer 3.0: a major update of the Phenol-Explorer database to incorporate data on the effects of food processing on polyphenol content

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Polyphenols are a major class of bioactive phytochemicals whose consumption may play a role in the prevention of a number of chronic diseases such as cardiovascular diseases, type II diabetes and cancers. Phenol-Explorer, launched in 2009, is the only freely available web-based database on the content of polyphenols in food and their in vivo metabolism and pharmacokinetics. Here we report the third release of the database (Phenol-Explorer 3.0), which adds data on the effects of food processing on polyphenol contents in foods. Data on >100 foods, covering 161 polyphenols or groups of polyphenols before and after processing, were collected from 129 peer-reviewed publications and entered into new tables linked to the existing relational design. The effect of processing on polyphenol content is expressed in the form of retention factor coefficients, or the proportion of a given polyphenol retained after processing, adjusted for change in water content. The result is the first database on the effects of food processing on polyphenol content and, following the model initially defined for Phenol-Explorer, all data may be traced back to original sources. The new update will allow polyphenol scientists to more accurately estimate polyphenol exposure from dietary surveys. Database URL: http://www.phenol-explorer.eu

Extending the roughness of the data via transitive closures of similarity indexes

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One main assumption in the theory of rough sets applied to information tables is that the elements that exhibit the same information are indiscernible (similar) and form blocks that can be understood as elementary granules of knowledge about the universe. We propose a variant of this concept defining a measure of similarity between the elements of the universe in order to consider that two objects can be indiscernible even though they do not share all the attribute values because the knowledge is partial or uncertain. The set of similarities define a matrix of a fuzzy relation satisfying reflexivity and symmetry but transitivity thus a partition of the universe is not attained. This problem can be solved calculating its transitive closure what ensure a partition for each level belonging to the unit interval [0,1]. This procedure allows generalizing the theory of rough sets depending on the minimum level of similarity accepted. This new point of view increases the rough character of the data because increases the set of indiscernible objects. Finally, we apply our results to a not real application to be capable to remark the differences and the improvements between this methodology and the classical one

«
1
2
3
»