399 results for "inferencia estadística" (statistical inference)


Relevance: 10.00%

Abstract:

The statistical analysis of compositional data is commonly used in geological studies. As is well known, compositions should be treated using logratios of parts, which are difficult to use correctly in standard statistical packages. In this paper we describe the new features of our freeware package, named CoDaPack, which implements most of the basic statistical methods suitable for compositional data. An example using real data is presented to illustrate the use of the package.

Relevance: 10.00%

Abstract:

There are two principal chemical concepts that are important for studying the natural environment. The first is thermodynamics, which describes whether a system is at equilibrium or can change spontaneously through chemical reactions. The second is kinetics, that is, how fast chemical reactions take place once they start. In this work we examine a natural system in which both thermodynamic and kinetic factors are important in determining the abundance of NH4+, NO2− and NO3− in surface waters. Samples were collected in the Arno Basin (Tuscany, Italy), a system in which natural and anthropogenic effects both contribute to strongly modifying the chemical composition of water. Thermodynamic modelling based on the reduction-oxidation reactions involving the passage NH4+ → NO2− → NO3− under equilibrium conditions made it possible to determine the Eh redox potential values that characterise the state of each sample and, consequently, of the fluid environment from which it was drawn. Just as pH expresses the concentration of H+ in solution, redox potential expresses the tendency of an environment to receive or supply electrons. In this context, oxic environments, such as those of river systems, are said to have a high redox potential because O2 is available as an electron acceptor. The principles of thermodynamics and chemical kinetics yield a model that often does not completely describe the reality of natural systems. Chemical reactions may indeed fail to achieve equilibrium because the products escape from the site of the reaction or because the reactions involved in the transformation are very slow, so that non-equilibrium conditions persist for long periods. Moreover, reaction rates can be sensitive to poorly understood catalytic effects or to surface effects, while variables such as concentration (a large number of chemical species can coexist and interact concurrently), temperature and pressure can have large gradients in natural systems.
Taking this into account, data from 91 water samples were modelled using statistical methodologies for compositional data. The application of log-contrast analysis yielded statistical parameters to be correlated with the calculated Eh values. In this way, natural conditions in which chemical equilibrium is hypothesised, as well as underlying fast reactions, are compared with those described by a stochastic approach.

Relevance: 10.00%

Abstract:

Compositional random vectors are fundamental tools in the Bayesian analysis of categorical data. Many of the issues that are discussed with reference to the statistical analysis of compositional data have a natural counterpart in the construction of a Bayesian statistical model for categorical data. This note builds on the idea of cross-fertilization of the two areas recommended by Aitchison (1986) in his seminal book on compositional data. Particular emphasis is put on the problem of what parameterization to use.

Relevance: 10.00%

Abstract:

At CoDaWork'03 we presented work on the analysis of archaeological glass compositional data. Such data typically consist of geochemical compositions involving 10-12 variables and approximate completely compositional data if the main component, silica, is included. We suggested that what has been termed 'crude' principal component analysis (PCA) of standardized data often identified interpretable pattern in the data more readily than analyses based on log-ratio transformed data (LRA). The fundamental problem is that, in LRA, minor oxides with high relative variation, which may not be structure carrying, can dominate an analysis and obscure pattern associated with variables present at higher absolute levels. We investigate this further using sub-compositional data relating to archaeological glasses found on Israeli sites. A simple model for glass-making is that it is based on a 'recipe' consisting of two 'ingredients', sand and a source of soda. Our analysis focuses on the sub-composition of components associated with the sand source. A 'crude' PCA of standardized data shows two clear compositional groups that can be interpreted in terms of different recipes being used at different periods, reflected in absolute differences in the composition. LRA can be undertaken either by normalizing the data or by defining a 'residual'. In either case, after some 'tuning', these groups are recovered. The results from the normalized LRA are differently interpreted as showing that the source of sand used to make the glass differed. These results are complementary. One relates to the recipe used. The other relates to the composition (and presumed sources) of one of the ingredients. It seems to be axiomatic in some expositions of LRA that statistical analysis of compositional data should focus on relative variation via the use of ratios. Our analysis suggests that absolute differences can also be informative.

Relevance: 10.00%

Abstract:

The classical statistical study of wind speed in the atmospheric surface layer is generally based on the analysis of the three usual components that make up the wind data, namely the W-E component, the S-N component and the vertical component, which are treated as independent. When the goal of the study is wind (aeolian) energy, that is, when wind is studied from an energetic point of view, the squares of the wind components can be considered compositional variables. To do so, each component has to be divided by the modulus of the corresponding vector. In this work the theoretical analysis of the wind components as compositional data is presented, together with the conclusions that can be drawn for practical applications and those that can be derived from applying this technique under different weather conditions.
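
The normalisation described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `wind_energy_composition` and the sample values are hypothetical and do not come from the paper. Dividing each component by the modulus of the wind vector and squaring gives three non-negative parts that sum to one.

```python
import math

def wind_energy_composition(u, v, w):
    """Return the shares of u^2, v^2 and w^2 in the squared wind speed.

    u, v, w: W-E, S-N and vertical components of one wind record
    (hypothetical argument names, chosen for this sketch).
    """
    modulus = math.sqrt(u * u + v * v + w * w)
    if modulus == 0.0:
        # A calm record has no direction, so the composition is undefined.
        raise ValueError("zero wind vector: composition undefined")
    # Divide each component by the modulus, then square it.
    return tuple((c / modulus) ** 2 for c in (u, v, w))

# Made-up record, used only to exercise the function.
parts = wind_energy_composition(3.0, -4.0, 0.5)
```

A component that is exactly zero yields a zero part, which would need separate treatment before any log-ratio analysis, since log-ratios require strictly positive parts.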

Relevance: 10.00%

Abstract:

Precision of released figures is not only an important quality feature of official statistics, it is also essential for a good understanding of the data. In this paper we show a case study of how precision could be conveyed if the multivariate nature of data has to be taken into account. In the official release of the Swiss earnings structure survey, the total salary is broken down into several wage components. We follow Aitchison's approach for the analysis of compositional data, which is based on logratios of components. We first present different multivariate analyses of the compositional data whereby the wage components are broken down by economic activity classes. Then we propose a number of ways to assess precision.

Relevance: 10.00%

Abstract:

A study of tin deposits from Priamurye (Russia) is performed to analyze the differences between them based on their origin and also on commercial criteria. A particular analysis based on their vertical zonality is also given for samples from the Solnechnoe deposit. All the statistical analyses are performed on the subcomposition formed by seven trace elements in cassiterite (In, Sc, Be, W, Nb, Ti and V) using Aitchison's methodology for the analysis of compositional data.

Relevance: 10.00%

Abstract:

Low concentrations of elements in geochemical analyses have the peculiarity of being compositional data and, for a given level of significance, are likely to be beyond the capabilities of laboratories to distinguish between minute concentrations and complete absence, thus preventing laboratories from reporting extremely low concentrations of the analyte. Instead, what is reported is the detection limit, which is the minimum concentration that conclusively differentiates between presence and absence of the element. A spatially distributed exhaustive sample is employed in this study to generate unbiased sub-samples, which are further censored to observe the effect that different detection limits and sample sizes have on the inference of population distributions starting from geochemical analyses having specimens below detection limit (nondetects). The isometric logratio transformation is used to convert the compositional data in the simplex to samples in real space, thus allowing the practitioner to properly borrow from the large source of statistical techniques valid only in real space. The bootstrap method is used to numerically investigate the reliability of inferring several distributional parameters employing different forms of imputation for the censored data. The case study illustrates that, in general, best results are obtained when imputations are made using the distribution best fitting the readings above detection limit and exposes the problems of other more widely used practices. When the sample is spatially correlated, it is necessary to combine the bootstrap with stochastic simulation.

Relevance: 10.00%

Abstract:

The statistical analysis of compositional data should be carried out using logratios of parts, which are difficult to use correctly in standard statistical packages. For this reason a freeware package, named CoDaPack, was created. This software implements most of the basic statistical methods suitable for compositional data. In this paper we describe the new version of the package, which is now called CoDaPack3D. It is developed in Visual Basic for Applications (associated with Excel©), Visual Basic and Open GL, and it is oriented towards users with a minimum knowledge of computers, with the aim of being simple and easy to use. This new version includes new graphical output in 2D and 3D. These outputs can be zoomed and, in 3D, rotated. A customization menu is also included and outputs can be saved in jpeg format. This new version also includes interactive help, and all dialog windows have been improved in order to facilitate its use. To use CoDaPack one has to open Excel© and enter the data in a standard spreadsheet. These should be organized as a matrix where Excel© rows correspond to the observations and columns to the parts. The user executes macros that return numerical or graphical results. There are two kinds of numerical results: new variables and descriptive statistics, and both appear on the same sheet. Graphical output appears in independent windows. In the present version there are 8 menus, with a total of 38 submenus which, after some dialogue, directly call the corresponding macro. The dialogues ask the user to input variables and further parameters needed, as well as where to put these results. The web site http://ima.udg.es/CoDaPack contains this freeware package, and only Microsoft Excel© under Microsoft Windows© is required to run the software.
Key words: Compositional data analysis, Software

Relevance: 10.00%

Abstract:

The R package "compositions" is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, now containing some facilities to work with and represent ilr bases built from balances, and an elaborate subsystem for dealing with several kinds of irregular data: (rounded or structural) zeroes, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a "regular" subcomposition (where all parts are actually observed and the datum behaves typically) and a "problematic" subcomposition (with those unobserved, zero or rounded parts, or else where the datum shows an erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros), focusing on the nature of irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcomposition where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occurrence and Monte-Carlo-based tests for their characterization. Furthermore, the package provides special plots that help to understand the nature of outliers in the dataset.
Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros

Relevance: 10.00%

Abstract:

A compositional time series is obtained when a compositional data vector is observed at different points in time. Inherently, then, a compositional time series is a multivariate time series with important constraints on the variables observed at any instance in time. Although this type of data frequently occurs in situations of real practical interest, a trawl through the statistical literature reveals that research in the field is very much in its infancy and that many theoretical and empirical issues still remain to be addressed. Any appropriate statistical methodology for the analysis of compositional time series must take into account the constraints which are not allowed for by the usual statistical techniques available for analysing multivariate time series. One general approach to analyzing compositional time series consists in the application of an initial transform to break the positive and unit sum constraints, followed by the analysis of the transformed time series using multivariate ARIMA models. In this paper we discuss the use of the additive log-ratio, centred log-ratio and isometric log-ratio transforms. We also present results from an empirical study designed to explore how the selection of the initial transform affects subsequent multivariate ARIMA modelling as well as the quality of the forecasts.
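
The three initial transforms named in this abstract can be sketched as follows for a single observation. This is a minimal illustration, not the authors' code: the function names and the sample vector are hypothetical, the last part is assumed as the alr divisor, and the ilr uses pivot coordinates, which is one possible orthonormal basis (the abstract does not say which basis was used). The ARIMA modelling step is left out.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def alr(x):
    """Additive log-ratio: log of each part over the last part."""
    return [math.log(xi / x[-1]) for xi in x[:-1]]

def clr(x):
    """Centred log-ratio: log of each part over the geometric mean."""
    logs = [math.log(xi) for xi in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def ilr(x):
    """Isometric log-ratio in pivot coordinates (an assumed basis)."""
    logs = [math.log(xi) for xi in x]
    coords = []
    for i in range(1, len(x)):
        # Log of the geometric mean of the first i parts.
        mean_first = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_first - logs[i]))
    return coords

# Made-up three-part observation, used only to exercise the functions.
x = closure([20.0, 30.0, 50.0])
```

For D parts, alr and ilr return D - 1 unconstrained coordinates, while clr returns D coordinates that sum to zero. The ilr coordinates have the same Euclidean norm as the clr vector, which is the sense in which the transform is isometric.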

Relevance: 10.00%

Abstract:

Simpson's paradox, also known as amalgamation or aggregation paradox, appears when dealing with proportions. Proportions are by construction parts of a whole, which can be interpreted as compositions assuming they only carry relative information. The Aitchison inner product space structure of the simplex, the sample space of compositions, explains the appearance of the paradox, given that amalgamation is a nonlinear operation within that structure. Here we propose to use balances, which are specific elements of this structure, to analyse situations where the paradox might appear. With the proposed approach we obtain that the centre of the tables analysed is a natural way to compare them, which avoids by construction the possibility of a paradox.
Key words: Aitchison geometry, geometric mean, orthogonal projection
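
A small numerical sketch may help to see why amalgamation can reverse a comparison. The counts below are illustrative and are not taken from the paper, and all names are hypothetical. The geometric mean of the per-stratum odds is used here only as a simple stand-in for the idea of comparing tables through their centre; it is not the authors' balance construction.

```python
import math

# Illustrative (successes, trials) counts per stratum; an assumption of
# this sketch, not data from the paper.
table_a = {"small": (81, 87), "large": (192, 263)}
table_b = {"small": (234, 270), "large": (55, 80)}

def stratum_rate(table, stratum):
    """Success proportion within one stratum."""
    successes, trials = table[stratum]
    return successes / trials

def pooled_rate(table):
    """Amalgamate strata by adding counts, then take the proportion."""
    successes = sum(s for s, _ in table.values())
    trials = sum(n for _, n in table.values())
    return successes / trials

def geometric_mean_odds(table):
    """Geometric mean of the per-stratum odds (successes : failures)."""
    log_odds = [math.log(s / (n - s)) for s, n in table.values()]
    return math.exp(sum(log_odds) / len(log_odds))

# A has the higher success rate in every stratum ...
a_wins_each = all(
    stratum_rate(table_a, k) > stratum_rate(table_b, k) for k in table_a
)
# ... yet B looks better once the strata are added together.
b_wins_pooled = pooled_rate(table_b) > pooled_rate(table_a)
# Averaging on the log-odds scale keeps the stratum-wise ordering.
a_wins_centre = geometric_mean_odds(table_a) > geometric_mean_odds(table_b)
```

Because the geometric mean is an ordinary average on the log scale, a table that has the higher odds in every stratum also has the higher geometric mean, so this comparison cannot reverse in the way the pooled proportions do.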

Relevance: 10.00%

Abstract:

A condition needed for testing nested hypotheses from a Bayesian viewpoint is that the prior for the alternative model concentrates mass around the small, or null, model. For testing independence in contingency tables, the intrinsic priors satisfy this requirement. Further, the degree of concentration of the priors is controlled by a discrete parameter m, the training sample size, which plays an important role in the resulting answer regardless of the sample size. In this paper we study robustness of the tests of independence in contingency tables with respect to the intrinsic priors with different degrees of concentration around the null, and compare with other "robust" results by Good and Crook. Consistency of the intrinsic Bayesian tests is established. We also discuss conditioning issues and sampling schemes, and argue that conditioning should be on either one margin or the table total, but not on both margins. Examples using real and simulated data are given.

Relevance: 10.00%

Abstract:

The study of crime is multifactorial; as far as possible, one must single out the environmental factors, those of the urban setting, that can act as facilitators of crime. Types of crime are diverse and have distinct motivations ("motives"), which makes the study of crime difficult. For this reason, two types of property theft are chosen for analysis: burglary with forced entry into homes, and petty theft, in the setting of a small-to-medium-sized city, as is the case of Girona. The author's knowledge of the urban setting is an essential part of carrying out the study. One of the ultimate objectives is to advance studies of the criminal diagnosis of cities, as well as knowledge of the spatial patterns of these types of crime and of the influence of urban design characteristics on crime. The methodology to be implemented comprises a literature review, descriptive and inferential statistics, geographic information systems, deduction from visual inspection of the maps generated, and fieldwork in the areas where the offences mentioned occur most often. As for the main hypotheses: property thefts are not concentrated in urbanistically degraded areas but, on the contrary, in well-kept urban environments. This hypothesis runs counter to the "Broken Windows" theory of crime, which holds that degraded urban spaces are where there is less occupation of public space and more crime. Another important hypothesis is that the spaces perceived as safe, with respect to the types of crime mentioned, are the most unsafe.

Relevance: 10.00%

Abstract:

Objective: to test whether, with respect to the onset of wheezing, breastfeeding acts as a protective factor and formula feeding as an inducing factor. Material and methods: randomised, double-blind controlled clinical trial with a control group and 8-year follow-up, of the Spanish subsample, in its 5th year of follow-up, of the European multicentre study EU CHILDHOOD OBESITY PROGRAMME (QLK1-2001-00389). The population was divided into 3 groups: infants fed low-protein formula, infants fed high-protein formula, and a control group of breastfed infants. To assess the onset of wheezing and its evolution over time, parents were interviewed, as the population reached 6 years of age, about matters relating to ages 3 and 6, and interviews were to be conducted at 8 years of age about matters relating to that same age. To check the impact on lung function and assess the atopic basis, it was planned to perform, at 8 years, spirometry, a prick test with aeroallergens, determination of total serum IgE and quantification of eosinophils in peripheral blood. Possible confounding factors were assessed, such as family history of allergy-based diseases, family socioeconomic level and epidemiological environment, and other associated morbidity was studied, such as episodes of fever, vomiting, diarrhoea, atopic dermatitis, upper respiratory tract colds and medical prescription of antibiotics. Results: only 20.8% were breastfed. No statistically significant differences were found between the history of wheezing episodes and the type of feeding received. Nor were statistically significant differences found between the feeding received and the history of atopic dermatitis.
Formula milk was associated, with statistical significance, with greater prescription of antibiotics and a higher incidence of diarrhoea and, without statistical significance, with an increased risk of upper respiratory tract colds (RVA). Breastfeeding was associated, with statistical significance, with lower prescription of antibiotics. The presence of older siblings and a low level of maternal education contributed to increased morbidity during the first year of life. Alcohol consumption during pregnancy was associated with more episodes of vomiting, and tobacco consumption with more episodes of diarrhoea. Conclusions: formula feeding does not predispose to more episodes of wheezing or of atopic dermatitis. Exclusive breastfeeding for at least 3 months reduces the risk of diarrhoea in the first 6 months of life and delays the onset of apparently bacterial infections that require antibiotic treatment. Exclusive breastfeeding for a minimum of three months does not entail a substantial reduction in morbidity during the first 12 months of life.