173 resultados para Metamorphism (Geology)
Resumo:
In human Population Genetics, routine applications of principal component techniques are oftenrequired. Population biologists make widespread use of certain discrete classifications of humansamples into haplotypes, the monophyletic units of phylogenetic trees constructed from severalsingle nucleotide bimorphisms hierarchically ordered. Compositional frequencies of the haplotypesare recorded within the different samples. Principal component techniques are then required as adimension-reducing strategy to bring the dimension of the problem to a manageable level, say two,to allow for graphical analysis.Population biologists at large are not aware of the special features of compositional data and normally make use of the crude covariance of compositional relative frequencies to construct principalcomponents. In this short note we present our experience with using traditional linear principalcomponents or compositional principal components based on logratios, with reference to a specificdataset
Resumo:
The classical statistical study of the wind speed in the atmospheric surface layer is madegenerally from the analysis of the three habitual components that perform the wind data,that is, the component W-E, the component S-N and the vertical component,considering these components independent.When the goal of the study of these data is the Aeolian energy, so is when wind isstudied from an energetic point of view and the squares of wind components can beconsidered as compositional variables. To do so, each component has to be divided bythe module of the corresponding vector.In this work the theoretical analysis of the components of the wind as compositionaldata is presented and also the conclusions that can be obtained from the point of view ofthe practical applications as well as those that can be derived from the application ofthis technique in different conditions of weather
Resumo:
The main instrument used in psychological measurement is the self-report questionnaire. One of its majordrawbacks however is its susceptibility to response biases. A known strategy to control these biases hasbeen the use of so-called ipsative items. Ipsative items are items that require the respondent to makebetween-scale comparisons within each item. The selected option determines to which scale the weight ofthe answer is attributed. Consequently in questionnaires only consisting of ipsative items everyrespondent is allotted an equal amount, i.e. the total score, that each can distribute differently over thescales. Therefore this type of response format yields data that can be considered compositional from itsinception.Methodological oriented psychologists have heavily criticized this type of item format, since the resultingdata is also marked by the associated unfavourable statistical properties. Nevertheless, clinicians havekept using these questionnaires to their satisfaction. This investigation therefore aims to evaluate bothpositions and addresses the similarities and differences between the two data collection methods. Theultimate objective is to formulate a guideline when to use which type of item format.The comparison is based on data obtained with both an ipsative and normative version of threepsychological questionnaires, which were administered to 502 first-year students in psychology accordingto a balanced within-subjects design. Previous research only compared the direct ipsative scale scoreswith the derived ipsative scale scores. The use of compositional data analysis techniques also enables oneto compare derived normative score ratios with direct normative score ratios. The addition of the secondcomparison not only offers the advantage of a better-balanced research strategy. In principle it also allowsfor parametric testing in the evaluation
Resumo:
Most of economic literature has presented its analysis under the assumption of homogeneous capital stock.However, capital composition differs across countries. What has been the pattern of capital compositionassociated with World economies? We make an exploratory statistical analysis based on compositional datatransformed by Aitchinson logratio transformations and we use tools for visualizing and measuring statisticalestimators of association among the components. The goal is to detect distinctive patterns in the composition.As initial findings could be cited that:1. Sectorial components behaved in a correlated way, building industries on one side and , in a lessclear view, equipment industries on the other.2. Full sample estimation shows a negative correlation between durable goods component andother buildings component and between transportation and building industries components.3. Countries with zeros in some components are mainly low income countries at the bottom of theincome category and behaved in a extreme way distorting main results observed in the fullsample.4. After removing these extreme cases, conclusions seem not very sensitive to the presence ofanother isolated cases
Resumo:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristicsin a text that are rarely controlled by the author, with those in other texts. When thegoal is to settle authorship questions, these characteristics should relate to the author’s style andnot to the genre, epoch or editor, and they should be such that their variation between authors islarger than the variation within comparable texts from the same author.For an overview of the literature on stylometry and some of the techniques involved, see for exampleMosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) orLebart, Salem and Berry (1998).Tirant lo Blanc, a chivalry book, is the main work in catalan literature and it was hailed to be“the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writterslike Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translatedseveral times into Spanish, Italian and French, with modern English translations by Rosenthal(1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465,but it was not printed until 1490.There is an intense and long lasting debate around its authorship sprouting from its first edition,where its introduction states that the whole book is the work of Martorell (1413?-1468), while atthe end it is stated that the last one fourth of the book is by Galba (?-1490), after the death ofMartorell. Some of the authors that support the theory of single authorship are Riquer (1990),Chiner (1993) and Badia (1993), while some of those supporting the double authorship are Riquer(1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990).Neither of the two candidate authors left any text comparable to the one under study, and thereforediscriminant analysis can not be used to help classify chapters by author. By using sample textsencompassing about ten percent of the book, and looking at word length and at the use of 44conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that mightindicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba andGinebra (2000) estimates that stylistic boundary to be near chapter 383.Following the lead of the extensive literature, this paper looks into word length, the use of the mostfrequent words and into the use of vowels in each chapter of the book. Given that the featuresselected are categorical, that leads to three contingency tables of ordered rows and therefore tothree sequences of multinomial observations.Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3describes the problem of the estimation of a suden change-point in those sequences, in the followingsections we propose various ways to estimate change-points in multinomial sequences; the methodin section 4 involves fitting models for polytomous data, the one in Section 5 fits gamma modelsonto the sequence of Chi-square distances between each row profiles and the average profile, theone in Section 6 fits models onto the sequence of values taken by the first component of thecorrespondence analysis as well as onto sequences of other summary measures like the averageword length. In Section 7 we fit models onto the marginal binomial sequences to identify thefeatures that distinguish the chapters before and after that boundary. Most methods rely heavilyon the use of generalized linear models
Resumo:
In several computer graphics areas, a refinement criterion is often needed to decide whether to goon or to stop sampling a signal. When the sampled values are homogeneous enough, we assume thatthey represent the signal fairly well and we do not need further refinement, otherwise more samples arerequired, possibly with adaptive subdivision of the domain. For this purpose, a criterion which is verysensitive to variability is necessary. In this paper, we present a family of discrimination measures, thef-divergences, meeting this requirement. These convex functions have been well studied and successfullyapplied to image processing and several areas of engineering. Two applications to global illuminationare shown: oracles for hierarchical radiosity and criteria for adaptive refinement in ray-tracing. Weobtain significantly better results than with classic criteria, showing that f-divergences are worth furtherinvestigation in computer graphics. Also a discrimination measure based on entropy of the samples forrefinement in ray-tracing is introduced. The recursive decomposition of entropy provides us with a naturalmethod to deal with the adaptive subdivision of the sampling region
Resumo:
Usually, psychometricians apply classical factorial analysis to evaluate construct validity of order rankscales. Nevertheless, these scales have particular characteristics that must be taken into account: totalscores and rank are highly relevant
Resumo:
Precision of released figures is not only an important quality feature of official statistics,it is also essential for a good understanding of the data. In this paper we show a casestudy of how precision could be conveyed if the multivariate nature of data has to betaken into account. In the official release of the Swiss earnings structure survey, the totalsalary is broken down into several wage components. We follow Aitchison's approachfor the analysis of compositional data, which is based on logratios of components. Wefirst present diferent multivariate analyses of the compositional data whereby the wagecomponents are broken down by economic activity classes. Then we propose a numberof ways to assess precision
Resumo:
Two contrasting case studies of sediment and detrital mineral composition are investigated in order to outline interactions between chemical composition and grain size. Modern glacial sediments exhibit a strong dependence of the two parameters due to the preferential enrichment of mafic minerals, especially biotite, in the fine-grained fractions. On the other hand, the composition of detrital heavy minerals (here: rutile) appears to be not systematically related to grain-size, but is strongly controlled by location, i.e. the petrology of the source rocks of detrital grains. This supports the use of rutile as a well-suited tracer mineral for provenance studies. The results further suggest that (i) interpretations derived from whole-rock sediment geochemistry should be flanked by grain-size observations, and (ii) a more sound statistical evaluation of these interactions require the development of new tailor-made statistical tools to deal with such so-called two-way compositions
Resumo:
A study of tin deposits from Priamurye (Russia) is performed to analyze the differencesbetween them based on their origin and also on commercial criteria. A particularanalysis based on their vertical zonality is also given for samples from Solnechnoedeposit. All the statistical analysis are made on the subcomposition formed by seventrace elements in cassiterite (In, Sc, Be, W, Nb, Ti and V) using the Aitchison’methodology of analysis of compositional data
Resumo:
Low concentrations of elements in geochemical analyses have the peculiarity of beingcompositional data and, for a given level of significance, are likely to be beyond thecapabilities of laboratories to distinguish between minute concentrations and completeabsence, thus preventing laboratories from reporting extremely low concentrations of theanalyte. Instead, what is reported is the detection limit, which is the minimumconcentration that conclusively differentiates between presence and absence of theelement. A spatially distributed exhaustive sample is employed in this study to generateunbiased sub-samples, which are further censored to observe the effect that differentdetection limits and sample sizes have on the inference of population distributionsstarting from geochemical analyses having specimens below detection limit (nondetects).The isometric logratio transformation is used to convert the compositional data in thesimplex to samples in real space, thus allowing the practitioner to properly borrow fromthe large source of statistical techniques valid only in real space. The bootstrap method isused to numerically investigate the reliability of inferring several distributionalparameters employing different forms of imputation for the censored data. The casestudy illustrates that, in general, best results are obtained when imputations are madeusing the distribution best fitting the readings above detection limit and exposes theproblems of other more widely used practices. When the sample is spatially correlated, itis necessary to combine the bootstrap with stochastic simulation
Resumo:
Isotopic data are currently becoming an important source of information regardingsources, evolution and mixing processes of water in hydrogeologic systems. However, itis not clear how to treat with statistics the geochemical data and the isotopic datatogether. We propose to introduce the isotopic information as new parts, and applycompositional data analysis with the resulting increased composition. Results areequivalent to downscale the classical isotopic delta variables, because they are alreadyrelative (as needed in the compositional framework) and isotopic variations are almostalways very small. This methodology is illustrated and tested with the study of theLlobregat River Basin (Barcelona, NE Spain), where it is shown that, though verysmall, isotopic variations comp lement geochemical principal components, and help inthe better identification of pollution sources
Resumo:
This paper examines a dataset which is modeled well by thePoisson-Log Normal process and by this process mixed with LogNormal data, which are both turned into compositions. Thisgenerates compositional data that has zeros without any need forconditional models or assuming that there is missing or censoreddata that needs adjustment. It also enables us to model dependenceon covariates and within the composition
Resumo:
The statistical analysis of compositional data should be treated using logratios of parts,which are difficult to use correctly in standard statistical packages. For this reason afreeware package, named CoDaPack was created. This software implements most of thebasic statistical methods suitable for compositional data.In this paper we describe the new version of the package that now is calledCoDaPack3D. It is developed in Visual Basic for applications (associated with Excel©),Visual Basic and Open GL, and it is oriented towards users with a minimum knowledgeof computers with the aim at being simple and easy to use.This new version includes new graphical output in 2D and 3D. These outputs could bezoomed and, in 3D, rotated. Also a customization menu is included and outputs couldbe saved in jpeg format. Also this new version includes an interactive help and alldialog windows have been improved in order to facilitate its use.To use CoDaPack one has to access Excel© and introduce the data in a standardspreadsheet. These should be organized as a matrix where Excel© rows correspond tothe observations and columns to the parts. The user executes macros that returnnumerical or graphical results. There are two kinds of numerical results: new variablesand descriptive statistics, and both appear on the same sheet. Graphical output appearsin independent windows. In the present version there are 8 menus, with a total of 38submenus which, after some dialogue, directly call the corresponding macro. Thedialogues ask the user to input variables and further parameters needed, as well aswhere to put these results. The web site http://ima.udg.es/CoDaPack contains thisfreeware package and only Microsoft Excel© under Microsoft Windows© is required torun the software.Kew words: Compositional data Analysis, Software
Resumo:
The R-package “compositions”is a tool for advanced compositional analysis. Its basicfunctionality has seen some conceptual improvement, containing now some facilitiesto work with and represent ilr bases built from balances, and an elaborated subsys-tem for dealing with several kinds of irregular data: (rounded or structural) zeroes,incomplete observations and outliers. The general approach to these irregularities isbased on subcompositions: for an irregular datum, one can distinguish a “regular” sub-composition (where all parts are actually observed and the datum behaves typically)and a “problematic” subcomposition (with those unobserved, zero or rounded parts, orelse where the datum shows an erratic or atypical behaviour). Systematic classificationschemes are proposed for both outliers and missing values (including zeros) focusing onthe nature of irregularities in the datum subcomposition(s).To compute statistics with values missing at random and structural zeros, a projectionapproach is implemented: a given datum contributes to the estimation of the desiredparameters only on the subcompositon where it was observed. For data sets withvalues below the detection limit, two different approaches are provided: the well-knownimputation technique, and also the projection approach.To compute statistics in the presence of outliers, robust statistics are adapted to thecharacteristics of compositional data, based on the minimum covariance determinantapproach. The outlier classification is based on four different models of outlier occur-rence and Monte-Carlo-based tests for their characterization. Furthermore the packageprovides special plots helping to understand the nature of outliers in the dataset.Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator,robustness, rounded zeros