975 results for Estadística matemàtica -- Informàtica


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The application of compositional data analysis through log-ratio transformations corresponds to a multinomial logit model for the shares themselves. This model is characterized by the property of Independence of Irrelevant Alternatives (IIA). IIA states that the odds ratio (in this case, the ratio of shares) is invariant to the addition or deletion of outcomes to the problem. It is exactly this invariance of the ratio that underlies the commonly used zero replacement procedure in compositional data analysis. In this paper we investigate using the nested logit model, which does not embody IIA, together with an associated zero replacement procedure, and compare its performance with that of the more usual approach of using the multinomial logit model. Our comparisons exploit a data set that combines voting data by electoral division with corresponding census data for each division for the 2001 Federal election in Australia.
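The IIA property described in this abstract can be checked numerically. The following is a minimal sketch, not the authors' code: the function names and the four coordinate values are hypothetical, and the inverse additive log-ratio map is written in its multinomial logit (softmax) form with the last outcome as reference.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def alr_inverse(coords):
    """Map additive log-ratio coordinates back to shares.

    The last outcome is the reference (its coordinate is fixed at 0),
    which gives the multinomial logit form exp(c_i) / sum_j exp(c_j).
    """
    expo = [math.exp(c) for c in coords] + [1.0]
    return closure(expo)

# Hypothetical log-ratio coordinates for four outcomes (e.g. party shares).
shares = alr_inverse([0.4, -0.2, 1.1])

# Delete the third outcome and re-close the remaining shares.
sub = closure(shares[:2] + shares[3:])

# IIA: the ratio of the first two shares is unchanged by the deletion.
ratio_full = shares[0] / shares[1]
ratio_sub = sub[0] / sub[1]
print(ratio_full, ratio_sub)
```

Both ratios equal exp(0.4 - (-0.2)), which is why replacing a zero share by a small value and re-closing leaves the ratios among the other shares untouched under this model.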

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Examples of compositional data. The simplex, a suitable sample space for compositional data and Aitchison's geometry. R, a free language and environment for statistical computing and graphics

Relevance:

100.00% 100.00%

Publisher:

Abstract:

All of the imputation techniques usually applied for replacing values below the detection limit in compositional data sets have adverse effects on the variability. In this work we propose a modification of the EM algorithm that is applied using the additive log-ratio transformation. This new strategy is applied to a compositional data set and the results are compared with the usual imputation techniques
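The abstract does not name the "usual imputation techniques" or give the modified EM algorithm, so the sketch below only illustrates the surrounding ideas under stated assumptions: a multiplicative replacement of zeros (an assumed example of a usual technique) followed by the additive log-ratio transformation. The value of `delta` and the sample composition are hypothetical.

```python
import math

def multiplicative_replacement(x, delta):
    """Replace zeros in a composition that sums to 1.

    Each zero becomes delta; the non-zero parts are shrunk by a common
    factor so that the total stays 1 and their mutual ratios are preserved.
    """
    n_zero = sum(1 for v in x if v == 0)
    shrink = 1.0 - n_zero * delta
    return [delta if v == 0 else v * shrink for v in x]

def alr(x):
    """Additive log-ratio transformation with the last part as reference."""
    ref = x[-1]
    return [math.log(v / ref) for v in x[:-1]]

# Hypothetical composition with one value below the detection limit (coded 0).
x = [0.6, 0.0, 0.3, 0.1]
r = multiplicative_replacement(x, delta=0.005)
coords = alr(r)
print(r, coords)
```

Because every zero receives the same fixed value, a single-value replacement of this kind tends to understate variability in the replaced part, which is the kind of adverse effect the abstract refers to.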

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the eighties, John Aitchison (1986) developed a new methodological approach for the statistical analysis of compositional data. This new methodology was implemented in Basic routines grouped under the name CODA and later NEWCODA in Matlab (Aitchison, 1997). After that, several other authors have published extensions to this methodology: Marín-Fernández and others (2000), Barceló-Vidal and others (2001), Pawlowsky-Glahn and Egozcue (2001, 2002) and Egozcue and others (2003). (...)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Compositional data naturally arises from the scientific analysis of the chemical composition of archaeological material such as ceramic and glass artefacts. Data of this type can be explored using a variety of techniques, from standard multivariate methods such as principal components analysis and cluster analysis, to methods based upon the use of log-ratios. The general aim is to identify groups of chemically similar artefacts that could potentially be used to answer questions of provenance. This paper will demonstrate work in progress on the development of a documented library of methods, implemented using the statistical package R, for the analysis of compositional data. R is an open source package that makes available very powerful statistical facilities at no cost. We aim to show how, with the aid of statistical software such as R, traditional exploratory multivariate analysis can easily be used alongside, or in combination with, specialist techniques of compositional data analysis. The library has been developed from a core of basic R functionality, together with purpose-written routines arising from our own research (for example that reported at CoDaWork'03). In addition, we have included other appropriate publicly available techniques and libraries that have been implemented in R by other authors. Available functions range from standard multivariate techniques through to various approaches to log-ratio analysis and zero replacement. We also discuss and demonstrate a small selection of relatively new techniques that have hitherto been little-used in archaeometric applications involving compositional data. The application of the library to the analysis of data arising in archaeometry will be demonstrated; results from different analyses will be compared; and the utility of the various methods discussed

Relevance:

100.00% 100.00%

Publisher:

Abstract:

“compositions” is a new R package for the analysis of compositional and positive data. It contains four classes corresponding to the four different types of compositional and positive geometry (including the Aitchison geometry). It provides means for computation, plotting and high-level multivariate statistical analysis in all four geometries. These geometries are treated in a fully analogous way, based on the principle of working in coordinates and on the object-oriented programming paradigm of R. In this way, called functions automatically select the most appropriate type of analysis as a function of the geometry. The graphical capabilities include ternary diagrams and tetrahedrons, various compositional plots (boxplots, barplots, piecharts) and extensive graphical tools for principal components. Afterwards, portion and proportion lines, straight lines and ellipses in all geometries can be added to plots. The package is accompanied by a hands-on introduction, documentation for every function, demos of the graphical capabilities and plenty of usage examples. It allows direct and parallel computation in all four vector spaces and provides the beginner with a copy-and-paste style of data analysis, while letting advanced users keep the functionality and customizability they demand of R, as well as all the tools necessary to add their own analysis routines. A complete example is included in the appendix.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

R, from http://www.r-project.org/, is ‘GNU S’, a language and environment for statistical computing and graphics. It is an environment in which many classical and modern statistical techniques have been implemented, although many are supplied as packages. There are 8 standard packages, and many more are available through the CRAN family of Internet sites, http://cran.r-project.org. We started to develop a library of functions in R to support the analysis of mixtures, and our goal is a MixeR package for compositional data analysis that provides support for: operations on compositions (perturbation and power multiplication, subcomposition with or without residuals, centering of the data, computing Aitchison’s, Euclidean and Bhattacharyya distances, compositional Kullback-Leibler divergence, etc.); graphical presentation of compositions in ternary diagrams and tetrahedrons with additional features (barycenter, geometric mean of the data set, percentile lines, marking and coloring of subsets of the data set and their geometric means, notation of individual data in the set, ...); dealing with zeros and missing values in compositional data sets, with R procedures for the simple and multiplicative replacement strategies; and the time series analysis of compositional data. We will present the current status of MixeR development and illustrate its use on selected data sets.
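The basic operations on compositions listed in this abstract can be sketched in a few lines. This is an illustration in Python using the standard definitions of perturbation, power multiplication and Aitchison distance; it is not the MixeR API, and all function names and sample values are hypothetical.

```python
import math

def closure(x):
    """Rescale positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, a):
    """Power multiplication: raise each part to the power a, then close."""
    return closure([v ** a for v in x])

def clr(x):
    """Centered log-ratio coordinates (logs minus their mean)."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr coordinates of x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

x = closure([10.0, 30.0, 60.0])
y = closure([20.0, 20.0, 60.0])
p = closure([1.0, 2.0, 5.0])

d_before = aitchison_distance(x, y)
d_after = aitchison_distance(perturb(x, p), perturb(y, p))
print(d_before, d_after)
```

The two printed distances agree up to rounding: the Aitchison distance is invariant under a common perturbation, which is what makes perturbation play the role of translation in the simplex.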

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The statistical analysis of compositional data is commonly used in geological studies. As is well-known, compositions should be treated using logratios of parts, which are difficult to use correctly in standard statistical packages. In this paper we describe the new features of our freeware package, named CoDaPack, which implements most of the basic statistical methods suitable for compositional data. An example using real data is presented to illustrate the use of the package

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There are two principal chemical concepts that are important for studying the natural environment. The first one is thermodynamics, which describes whether a system is at equilibrium or can spontaneously change by chemical reactions. The second main concept is how fast chemical reactions take place once they start (kinetics, or the rate of chemical change). In this work we examine a natural system in which both thermodynamic and kinetic factors are important in determining the abundance of NH₄⁺, NO₂⁻ and NO₃⁻ in superficial waters. Samples were collected in the Arno Basin (Tuscany, Italy), a system in which natural and anthropic effects both contribute to strongly modifying the chemical composition of the water. Thermodynamic modelling based on the reduction-oxidation reactions involving the passage NH₄⁺ -> NO₂⁻ -> NO₃⁻ under equilibrium conditions has made it possible to determine the Eh redox potential values that characterise the state of each sample and, consequently, of the fluid environment from which it was drawn. Just as pH expresses the concentration of H⁺ in solution, redox potential is used to express the tendency of an environment to receive or supply electrons. In this context, oxic environments, such as those of river systems, are said to have a high redox potential because O₂ is available as an electron acceptor. Principles of thermodynamics and chemical kinetics yield a model that often does not completely describe the reality of natural systems. Chemical reactions may indeed fail to achieve equilibrium because the products escape from the site of the reaction or because the reactions involved in the transformation are very slow, so that non-equilibrium conditions exist for long periods. Moreover, reaction rates can be sensitive to poorly understood catalytic effects or to surface effects, while variables such as concentration (a large number of chemical species can coexist and interact concurrently), temperature and pressure can have large gradients in natural systems.
Taking this into account, data from 91 water samples have been modelled using statistical methodologies for compositional data. The application of log-contrast analysis has yielded statistical parameters to be correlated with the calculated Eh values. In this way, natural conditions in which chemical equilibrium is hypothesised, as well as underlying fast reactions, are compared with those described by a stochastic approach.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

First application of compositional data analysis techniques to Australian election data

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The R package “compositions” is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, now containing some facilities to work with and represent ilr bases built from balances, and an elaborate subsystem for dealing with several kinds of irregular data: (rounded or structural) zeroes, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a “regular” subcomposition (where all parts are actually observed and the datum behaves typically) and a “problematic” subcomposition (with those unobserved, zero or rounded parts, or else where the datum shows an erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros), focusing on the nature of irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcomposition where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occurrence and Monte-Carlo-based tests for their characterization. Furthermore, the package provides special plots that help to understand the nature of outliers in the data set. Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Pós-graduação em Educação Matemática - IGCE

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The Tratado de Estadística by Olegario Fernández Baños was the first book on mathematical statistics in the modern sense to be published in Spain. Previously, statistics books had been published for the course in Geography and Industrial and Commercial Statistics at the Schools of Commerce and for the course in Political Economy at the Faculties of Law. The textbooks for those courses generally dealt with topics of an administrative nature, descriptions of the statistical methods used, and the application of statistics to Spain.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Observations in daily practice are sometimes registered as positive values larger than a given threshold α. The sample space is in this case the interval (α,+∞), α > 0, which can be structured as a real Euclidean space in different ways. This fact opens the door to alternative statistical models depending not only on the assumed distribution function, but also on the metric which is considered appropriate, i.e. the way differences are measured, and thus variability.
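The abstract does not say which Euclidean structures are considered. As one possible illustration, the sketch below assumes a logarithmic structure in which a value x in (α,+∞) has the coordinate ln(x − α), so that differences are measured on a relative scale above the threshold. The function names and the numbers are hypothetical.

```python
import math

def coordinate(x, alpha):
    """Assumed coordinate of x in (alpha, +inf): ln(x - alpha)."""
    if x <= alpha:
        raise ValueError("x must be strictly larger than the threshold alpha")
    return math.log(x - alpha)

def log_distance(x, y, alpha):
    """Distance under the assumed log structure."""
    return abs(coordinate(x, alpha) - coordinate(y, alpha))

def absolute_distance(x, y):
    """Ordinary distance inherited from the real line."""
    return abs(x - y)

alpha = 1.0
# The pairs (2, 3) and (3, 5) differ by 1 and 2 in absolute terms,
# but both correspond to a doubling of the excess over alpha.
d_log_a = log_distance(2.0, 3.0, alpha)
d_log_b = log_distance(3.0, 5.0, alpha)
d_abs_a = absolute_distance(2.0, 3.0)
d_abs_b = absolute_distance(3.0, 5.0)
print(d_log_a, d_log_b, d_abs_a, d_abs_b)
```

Under the assumed log structure the two pairs are equally far apart, whereas the ordinary metric ranks the second pair as twice as distant, so measures of variability depend on which metric is judged appropriate.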