85 results for Estadística matemática


Relevance:

60.00%

Publisher:

Abstract:

The amalgamation operation is frequently used to reduce the number of parts of compositional data but it is a non-linear operation in the simplex with the usual geometry, the Aitchison geometry. The concept of balances between groups, a particular coordinate system designed over binary partitions of the parts, could be an alternative to the amalgamation in some cases. In this work we discuss the proper application of both concepts using a real data set corresponding to behavioral measures of pregnant sows
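
As a rough illustration of the two concepts contrasted in this abstract, the sketch below computes a balance between two groups of parts and an amalgamation of one group. The balance formula used, sqrt(rs/(r+s)) times the log of the ratio of group geometric means, is the standard definition and is not quoted from the abstract; the 4-part composition and the function names are hypothetical.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def balance(parts, group_r, group_s):
    """Balance between two groups of parts, given as lists of indices."""
    r, s = len(group_r), len(group_s)
    gm_r = geometric_mean([parts[i] for i in group_r])
    gm_s = geometric_mean([parts[i] for i in group_s])
    return math.sqrt(r * s / (r + s)) * math.log(gm_r / gm_s)

def amalgamate(parts, group):
    """Amalgamation: add up the parts in `group` into a single part."""
    return sum(parts[i] for i in group)

# Hypothetical 4-part composition (not the sow behaviour data).
x = closure([10.0, 20.0, 30.0, 40.0])
b = balance(x, [0, 1], [2, 3])
# Rescaling every part leaves the balance unchanged.
b_scaled = balance([5 * p for p in x], [0, 1], [2, 3])
merged = amalgamate(x, [0, 1])
```

The balance is a log-contrast, so it is a linear coordinate in the Aitchison geometry, whereas the amalgamated part is an ordinary sum and does not have that property.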

Relevance:

60.00%

Publisher:

Abstract:

Planners in public and private institutions would like coherent forecasts of the components of age-specific mortality, such as causes of death. This has been difficult to achieve because the relative values of the forecast components often fail to behave in a way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It has been shown that cause-specific mortality forecasts are pessimistic when compared with all-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approach of using log mortality rates and forecasts the density of deaths in the life table. Since these values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbing state), they are intrinsically relative rather than absolute values across decrements as well as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison (1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that the unit sum constraint is honoured. The structure of the best-known, single-decrement mortality-rate forecasting model, devised by Lee and Carter (1992), is expressed in compositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortality by cause of death for Japan
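
The transform, model, back-transform cycle described above can be sketched as follows. This is only an illustration under stated assumptions: the centered logratio (clr) is used as the transformation, a rank-one SVD stands in for a Lee-Carter-type structure, the data are randomly generated, and the forecasting step for the time index is left out. None of this is taken from the paper itself.

```python
import numpy as np

def closure(x):
    """Rescale each row so that it sums to 1."""
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centered logratio transform, applied row by row."""
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)

def clr_inv(y):
    """Back-transform clr values to positive parts with unit sum."""
    return closure(np.exp(y))

rng = np.random.default_rng(0)
# Hypothetical data: 30 years x 10 age groups of life-table death densities.
deaths = closure(rng.uniform(0.5, 1.5, size=(30, 10)))

y = clr(deaths)
alpha = y.mean(axis=0)                   # average age pattern in clr space
u, s, vt = np.linalg.svd(y - alpha, full_matrices=False)
kappa = u[:, 0] * s[0]                   # time index (one value per year)
beta = vt[0]                             # age response to the time index
fitted = clr_inv(alpha + np.outer(kappa, beta))
```

Because the fitted values are back-transformed with a closure, every fitted row is positive and sums to one by construction.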

Relevance:

60.00%

Publisher:

Abstract:

The preceding two editions of CoDaWork included talks on the possible consideration of densities as infinite compositions: Egozcue and Díaz-Barrero (2003) extended the Euclidean structure of the simplex to a Hilbert space structure of the set of densities within a bounded interval, and van den Boogaart (2005) generalized this to the set of densities bounded by an arbitrary reference density. From the many variations of the Hilbert structures available, we work with three cases. For bounded variables, a basis derived from Legendre polynomials is used. For variables with a lower bound, we standardize them with respect to an exponential distribution and express their densities as coordinates in a basis derived from Laguerre polynomials. Finally, for unbounded variables, a normal distribution is used as reference, and coordinates are obtained with respect to a Hermite-polynomials-based basis. To get the coordinates, several approaches can be considered. A numerical accuracy problem occurs if one estimates the coordinates directly by using discretized scalar products. Thus we propose to use a weighted linear regression approach, where all k-order polynomials are used as predictand variables and weights are proportional to the reference density. Finally, for the case of 2-order Hermite polynomials (normal reference) and 1-order Laguerre polynomials (exponential), one can also derive the coordinates from their relationships to the classical mean and variance. Apart from these theoretical issues, this contribution focuses on the application of this theory to two main problems in sedimentary geology: the comparison of several grain size distributions, and the comparison among different rocks of the empirical distribution of a property measured on a batch of individual grains from the same rock or sediment, like their composition
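
One possible reading of the weighted regression idea for the normal-reference case is sketched below. It makes several assumptions: the log-ratio of a density to the standard normal reference is regressed on the probabilists' Hermite polynomials He_0 to He_2 (treated here as regressors), the polynomials are left unnormalized, and the weights are proportional to the reference density. The function name and the test density N(0.5, 2²) are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

def hermite_coordinates(x, log_ratio, degree):
    """Weighted least-squares fit of log(f/f0) on He_0 ... He_degree.

    Weights are proportional to the standard normal reference density f0.
    """
    weights = np.exp(-0.5 * x**2)
    design = hermevander(x, degree)       # columns He_0(x) ... He_degree(x)
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(design * sw, log_ratio[:, None] * sw, rcond=None)
    return coef.ravel()

mu, sigma = 0.5, 2.0
x = np.linspace(-6.0, 6.0, 241)
# Log-densities up to the shared constant -0.5*log(2*pi), which cancels.
log_f = -np.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2
log_f0 = -0.5 * x**2
coef = hermite_coordinates(x, log_f - log_f0, degree=2)
```

For this normal example the first-order coefficient equals mu/sigma² and the second-order one equals 1/2 - 1/(2 sigma²), which is in line with the abstract's remark that low-order coordinates relate to the classical mean and variance.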

Relevance:

60.00%

Publisher:

Abstract:

A novel metric comparison of the appendicular skeleton (fore and hind limb) of different vertebrates using the Compositional Data Analysis (CDA) methodological approach is presented. 355 specimens belonging to various taxa of Dinosauria (Sauropodomorpha, Theropoda, Ornithischia and Aves) and Mammalia (Prototheria, Metatheria and Eutheria) were analyzed with CDA. A special focus has been put on Sauropodomorpha dinosaurs, and the Aitchison distance has been used as a measure of disparity in limb element proportions to infer some aspects of functional morphology
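
For readers unfamiliar with the Aitchison distance mentioned above, a minimal sketch follows. It uses the standard definition (Euclidean distance between clr-transformed compositions); the three-element limb measurements are invented for illustration and are not taken from the study.

```python
import math

def clr(parts):
    """Centered logratio transform of a list of positive parts."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr transforms of x and y."""
    cx, cy = clr(x), clr(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

# Hypothetical lengths of three limb elements, in any common unit.
specimen_a = [30.0, 20.0, 10.0]
specimen_b = [60.0, 40.0, 20.0]   # same proportions, twice the size
specimen_c = [30.0, 25.0, 5.0]    # different proportions

d_ab = aitchison_distance(specimen_a, specimen_b)
d_ac = aitchison_distance(specimen_a, specimen_c)
```

Specimens with identical proportions are at distance zero regardless of absolute size, which is why the distance can serve as a measure of disparity in proportions.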

Relevance:

60.00%

Publisher:

Abstract:

Factor analysis, a frequently used technique for multivariate data inspection, is also widely used for compositional data analysis. The usual way is to use a centered logratio (clr) transformation to obtain the random vector y of dimension D. The factor model is then y = Λf + e (1), with the factors f of dimension k < D, the error term e, and the loadings matrix Λ. Using the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as Cov(y) = ΛΛ^T + ψ (2), where ψ = Cov(e) has a diagonal form. The diagonal elements of ψ as well as the loadings matrix Λ are estimated from an estimate of Cov(y), given observed clr-transformed data Y as realizations of the random vector y. Outliers or deviations from the idealized model assumptions of factor analysis can severely affect the parameter estimation. As a way out, robust estimation of the covariance matrix of Y will lead to robust estimates of Λ and ψ in (2), see Pison et al. (2003). Well-known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g., Maronna et al., 2006), rely on a full-rank data matrix Y, which is not the case for clr-transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix Y is transformed to a matrix Z by using an orthonormal basis of lower dimension. Using the ilr-transformed data, a robust covariance matrix C(Z) can be estimated. The result can be back-transformed to the clr space by C(Y) = V C(Z) V^T, where the matrix V with orthonormal columns comes from the relation between the clr and the ilr transformation. Now the parameters in the model (2) can be estimated (Basilevsky, 1994) and the results have a direct interpretation since the links to the original variables are still preserved. The above procedure will be applied to data from geochemistry. Our special interest is in comparing the results with those of Reimann et al. (2002) for the Kola project data
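
The back-transformation C(Y) = V C(Z) V^T can be checked numerically with the sketch below. It rests on assumptions: V is one particular orthonormal basis (a Helmert-type construction chosen here for convenience), the data are random, and the classical sample covariance `np.cov` is used as a stand-in where a robust estimator such as the MCD would be applied in the procedure described above.

```python
import numpy as np

def clr(x):
    """Centered logratio transform, applied row by row."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def ilr_basis(d):
    """D x (D-1) matrix V with orthonormal columns, each summing to zero."""
    v = np.zeros((d, d - 1))
    for j in range(1, d):
        v[:j, j - 1] = 1.0 / j
        v[j, j - 1] = -1.0
        v[:, j - 1] *= np.sqrt(j / (j + 1.0))
    return v

rng = np.random.default_rng(1)
# Hypothetical 4-part compositions, 50 observations.
x = rng.uniform(1.0, 10.0, size=(50, 4))
x = x / x.sum(axis=1, keepdims=True)

V = ilr_basis(4)
Y = clr(x)                        # clr data: rank D-1, singular covariance
Z = Y @ V                         # ilr coordinates: full rank
C_Z = np.cov(Z, rowvar=False)     # stand-in; a robust estimator would go here
C_Y = V @ C_Z @ V.T               # back-transformed covariance in clr space
```

With the classical covariance the back-transformed matrix coincides with the covariance of the clr data, which confirms that the mapping itself loses nothing; with a robust estimator only `C_Z` would change.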

Relevance:

40.00%

Publisher:

Abstract:

The work is conceptually framed within the affective dimension of mathematics, a perspective that highlights the role of emotions, attitudes, beliefs and behaviours in the construction of mathematical knowledge and thinking

Relevance:

30.00%

Publisher:

Abstract:

The Càtedra Lluís Santaló d’Aplicacions de la Matemàtica has promoted a series of lectures and an exhibition that took place this autumn in Girona. In the background there was also the celebration, on 20 October, of the first World Statistics Day

Relevance:

20.00%

Publisher:

Abstract:

This is teaching material intended to help students in the field of education learn the fundamentals of descriptive and applied statistics. The publication aims to help readers understand the basic concepts so that they can use them confidently in research contexts and interpret the data of educational and social studies. The document is organised in five sections, and each section contains an explanation of the concept or concepts covered, one or two worked examples, proposed exercises and the answers to these exercises. Finally, there is a sixth chapter with review exercises

Relevance:

20.00%

Publisher:

Abstract:

We take stock of the present position of compositional data analysis, of what has been achieved in the last 20 years, and then make suggestions as to what may be sensible avenues of future research. We take an uncompromisingly applied mathematical view, that the challenge of solving practical problems should motivate our theoretical research; and that any new theory should be thoroughly investigated to see if it may provide answers to previously abandoned practical considerations. Indeed a main theme of this lecture will be to demonstrate this applied mathematical approach by a number of challenging examples

Relevance:

20.00%

Publisher:

Abstract:

The simplex, the sample space of compositional data, can be structured as a real Euclidean space. This fact allows us to work with the coefficients with respect to an orthonormal basis. Over these coefficients we apply standard real analysis; in particular, we define two different laws of probability through the density function and we study their main properties

Relevance:

20.00%

Publisher:

Abstract:

Traditionally, compositional data have been identified with closed data, and the simplex has been considered as the natural sample space of this kind of data. In our opinion, the emphasis on the constrained nature of compositional data has contributed to masking their real nature. More crucial than the constraining property of compositional data is the scale-invariant property of this kind of data. Indeed, when we are considering only a few parts of a full composition we are not working with constrained data, but our data are still compositional. We believe that it is necessary to give a more precise definition of composition. This is the aim of this oral contribution
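
The scale-invariance argument can be illustrated with a small sketch. The assumption here, not stated in the abstract, is that the relevant information is carried by the pairwise log-ratios of the parts; the numbers are hypothetical.

```python
import math

def log_ratios(parts):
    """All pairwise log-ratios ln(x_i / x_j) for i < j."""
    n = len(parts)
    return [math.log(parts[i] / parts[j])
            for i in range(n) for j in range(i + 1, n)]

full = [12.0, 3.0, 6.0, 79.0]           # hypothetical 4-part composition, in percent
sub = full[:3]                          # three of the parts: no longer sums to 100
sub_rescaled = [p * 7.5 for p in sub]   # same parts expressed in another scale

same = all(abs(a - b) < 1e-12
           for a, b in zip(log_ratios(sub), log_ratios(sub_rescaled)))
```

The subset of parts is not constrained to a fixed sum, yet its log-ratios are unaffected by the change of scale, which is the sense in which such data remain compositional.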

Relevance:

20.00%

Publisher:

Abstract:

Compositional random vectors are fundamental tools in the Bayesian analysis of categorical data. Many of the issues that are discussed with reference to the statistical analysis of compositional data have a natural counterpart in the construction of a Bayesian statistical model for categorical data. This note builds on the idea of cross-fertilization of the two areas recommended by Aitchison (1986) in his seminal book on compositional data. Particular emphasis is put on the problem of what parameterization to use

Relevance:

20.00%

Publisher:

Abstract:

The classical statistical study of the wind speed in the atmospheric surface layer is generally based on the analysis of the three usual components that make up the wind data, that is, the W-E component, the S-N component and the vertical component, with these components considered independent. When the goal of the study is wind (Aeolian) energy, that is, when wind is studied from an energetic point of view, the squares of the wind components can be considered as compositional variables. To do so, each component has to be divided by the modulus of the corresponding vector. In this work the theoretical analysis of the components of the wind as compositional data is presented, together with the conclusions that can be drawn from the point of view of practical applications and those that can be derived from applying this technique in different weather conditions
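
The construction described above, dividing each component by the modulus of the wind vector and squaring, can be sketched in a few lines. The function name and the sample reading are hypothetical.

```python
import math

def energy_composition(u, v, w):
    """Squared wind components divided by the squared modulus of the vector."""
    modulus = math.sqrt(u * u + v * v + w * w)
    return [(c / modulus) ** 2 for c in (u, v, w)]

# Hypothetical reading in m/s: W-E, S-N and vertical components.
parts = energy_composition(3.0, -4.0, 0.5)
total = sum(parts)
```

The three resulting parts are non-negative and sum to one, so they form a composition; the sign of each original component is lost in the squaring.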

Relevance:

20.00%

Publisher:

Abstract:

The application of discriminant function analysis (DFA) is not a new idea in the study of tephrochronology. In this paper, DFA is applied to compositional datasets of two different types of tephras from Mount Ruapehu in New Zealand and Mount Rainier in the USA. The canonical variables from the analysis are further investigated with a statistical methodology of change-point problems in order to gain a better understanding of the change in compositional pattern over time. Finally, a special case of segmented regression has been proposed to model both the time of change and the change in pattern. This model can be used to estimate the age of unknown tephras using Bayesian statistical calibration

Relevance:

20.00%

Publisher:

Abstract:

Low concentrations of elements in geochemical analyses have the peculiarity of being compositional data and, for a given level of significance, are likely to be beyond the capabilities of laboratories to distinguish between minute concentrations and complete absence, thus preventing laboratories from reporting extremely low concentrations of the analyte. Instead, what is reported is the detection limit, which is the minimum concentration that conclusively differentiates between presence and absence of the element. A spatially distributed exhaustive sample is employed in this study to generate unbiased sub-samples, which are further censored to observe the effect that different detection limits and sample sizes have on the inference of population distributions starting from geochemical analyses having specimens below detection limit (nondetects). The isometric logratio transformation is used to convert the compositional data in the simplex to samples in real space, thus allowing the practitioner to properly borrow from the large source of statistical techniques valid only in real space. The bootstrap method is used to numerically investigate the reliability of inferring several distributional parameters employing different forms of imputation for the censored data. The case study illustrates that, in general, best results are obtained when imputations are made using the distribution best fitting the readings above detection limit and exposes the problems of other more widely used practices. When the sample is spatially correlated, it is necessary to combine the bootstrap with stochastic simulation