994 results for compositional data
Abstract:
The Dirichlet family owes its privileged status within simplex distributions to its ease of interpretation and good mathematical properties. In particular, we recall properties fundamental to the analysis of compositional data, such as closure under amalgamation and subcomposition. From a probabilistic point of view, it is characterised (uniquely) by a variety of independence relationships, which makes it indisputably the reference model for expressing the non-trivial idea of substantial independence for compositions. Indeed, its well-known inadequacy as a general model for compositional data stems from this independence structure together with the poverty of its parametrisation. In this paper a new class of distributions (called Flexible Dirichlet), capable of handling various dependence structures and containing the Dirichlet as a special case, is presented. The new model exhibits a considerably richer parametrisation which, for example, allows the means and (part of) the variance-covariance matrix to be modelled separately. Moreover, the model preserves some good mathematical properties of the Dirichlet, i.e. closure under amalgamation and subcomposition, with new parameters simply related to the parent composition parameters. Furthermore, the joint and conditional distributions of subcompositions and relative totals can be expressed as simple mixtures of two Flexible Dirichlet distributions. The basis generating the Flexible Dirichlet, though keeping compositional invariance, shows a dependence structure which allows various forms of partitional dependence to be contemplated by the model (e.g. non-neutrality, subcompositional dependence and subcompositional non-invariance), independence cases being identified by suitable parameter configurations. In particular, within this model substantial independence among subsets of components of the composition naturally occurs when the subsets have a Dirichlet distribution.
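For readers unfamiliar with the two closure properties named in this abstract, they can be stated for the ordinary Dirichlet as follows. This is the standard textbook statement, not a formula taken from the paper; the symbols $X$, $\alpha_i$, $D$ and $k$ are notation assumed here, and the corresponding Flexible Dirichlet formulas are not reproduced.

```latex
% Notation assumed for illustration: X is a D-part composition.
X = (X_1, \dots, X_D) \sim \mathrm{Dir}(\alpha_1, \dots, \alpha_D)

% Closure under amalgamation: summing parts sums their parameters.
(X_1 + X_2,\; X_3, \dots, X_D) \sim \mathrm{Dir}(\alpha_1 + \alpha_2,\; \alpha_3, \dots, \alpha_D)

% Closure under subcomposition: re-closing the first k parts (k < D)
% keeps their parameters, independently of the total X_1 + \dots + X_k.
\left( \frac{X_1}{\sum_{i=1}^{k} X_i}, \dots, \frac{X_k}{\sum_{i=1}^{k} X_i} \right)
  \sim \mathrm{Dir}(\alpha_1, \dots, \alpha_k)
```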
Abstract:
The identification of compositional changes in fumarolic gases of active and quiescent volcanoes is one of the most important targets in monitoring programs. From a general point of view, many systematic (often cyclic) and random processes control the chemistry of gas discharges, making it difficult to produce a convincing mathematical-statistical model. Changes in the chemical composition of volcanic gases sampled at Vulcano Island (Aeolian Arc, Sicily, Italy) from eight different fumaroles located in the northern sector of the summit crater (La Fossa) have been analysed by considering their dependence on time in the period 2000-2007. Each intermediate chemical composition has been considered as potentially derived from the contribution of the two temporal extremes, represented by the 2000 and 2007 samples respectively, by using inverse modelling methodologies for compositional data. Data pertaining to fumaroles F5 and F27, located on the rim and in the inner part of La Fossa crater respectively, have been used to achieve the proposed aim. The statistical approach has allowed us to highlight the presence of random and non-random fluctuations, features useful for understanding how the volcanic system works, opening new perspectives in sampling strategies and in the evaluation of the natural risk related to a quiescent volcano.
Abstract:
Pounamu (NZ jade), or nephrite, is a protected mineral in its natural form following the transfer of ownership back to Ngai Tahu under the Ngai Tahu (Pounamu Vesting) Act 1997. Any theft of nephrite is prosecutable under the Crimes Act 1961. Scientific evidence is essential in cases where origin is disputed. A robust method for discrimination of this material through the use of elemental analysis and compositional data analysis is required. Initial studies have characterised the variability within a given nephrite source. This has included investigation of both in situ outcrops and alluvial material. Methods for the discrimination of two geographically close nephrite sources are being developed. Key Words: forensic, jade, nephrite, laser ablation, inductively coupled plasma mass spectrometry, multivariate analysis, elemental analysis, compositional data analysis
Abstract:
The Hardy-Weinberg law, formulated about 100 years ago, states that under certain assumptions, the three genotypes AA, AB and BB at a bi-allelic locus are expected to occur in the proportions p², 2pq and q² respectively, where p is the allele frequency of A, and q = 1 - p. Many statistical tests are used to check whether empirical marker data obey the Hardy-Weinberg principle. Among these are the classical chi-square test (with or without continuity correction), the likelihood ratio test, Fisher's exact test, and exact tests in combination with Monte Carlo and Markov chain algorithms. Tests for Hardy-Weinberg equilibrium (HWE) are numerical in nature, requiring the computation of a test statistic and a p-value. There is, however, ample space for the use of graphics in HWE tests, in particular for the ternary plot. Nowadays, many genetic studies use genetic markers known as Single Nucleotide Polymorphisms (SNPs). SNP data come in the form of counts, but from the counts one typically computes genotype frequencies and allele frequencies. These frequencies satisfy the unit-sum constraint, and their analysis therefore falls within the realm of compositional data analysis (Aitchison, 1986). SNPs are usually bi-allelic, which implies that the genotype frequencies can be adequately represented in a ternary plot. Compositions that are in exact HWE describe a parabola in the ternary plot. Compositions for which HWE cannot be rejected in a statistical test are typically "close" to the parabola, whereas compositions that differ significantly from HWE are "far". By rewriting the statistics used to test for HWE in terms of heterozygote frequencies, acceptance regions for HWE can be obtained that can be depicted in the ternary plot. In this way, compositions can be tested for HWE purely on the basis of their position in the ternary plot (Graffelman & Morales, 2008). This leads to clear graphical representations in which large numbers of SNPs can be tested for HWE in a single graph. Several examples of graphical tests for HWE (implemented in R software) will be shown, using SNP data from different human populations.
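The numerical side of the test described in this abstract can be sketched as follows. This is a minimal illustration of the classical chi-square test without continuity correction, not the R implementation the authors refer to; the function names and the genotype counts used in the usage note are hypothetical.

```python
import math


def hwe_chisq(n_aa, n_ab, n_bb):
    """Chi-square statistic (no continuity correction) for HWE from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # allele frequency of A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chisq_1df_pvalue(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_heterozygote_frequency(p):
    """Heterozygote frequency 2pq of a composition lying exactly on the HWE parabola."""
    return 2.0 * p * (1.0 - p)
```

With hypothetical counts of 30 AA, 40 AB and 30 BB, the allele frequency is 0.5, the expected counts are 25, 50 and 25, and `hwe_chisq(30, 40, 30)` gives 4.0, which exceeds the 5% critical value of 3.841 for one degree of freedom. In a ternary plot this composition has a heterozygote frequency of 0.4, below the 0.5 that `hwe_heterozygote_frequency(0.5)` gives for the parabola.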
Abstract:
The theory of compositional data analysis is often focused on the composition only. In practical applications, however, we often treat a composition together with covariables measured on some other scale. This contribution systematically gathers and develops statistical tools for this situation. For instance, for the graphical display of the dependence of a composition on a categorical variable, a colored set of ternary diagrams might be a good idea for a first look at the data, but it will quickly hide important aspects if the composition has many parts or takes extreme values. On the other hand, colored scatterplots of ilr components may not be very instructive for the analyst if the conventional, black-box ilr is used. Thinking in terms of the Euclidean structure of the simplex, we suggest setting up appropriate projections, which on the one hand show the compositional geometry and on the other hand are still comprehensible to a non-expert analyst and readable for all locations and scales of the data. This is done, for example, by defining special balance displays with carefully selected axes. Following this idea, we need to ask systematically how to display, explore, describe and test the relation to complementary or explanatory data of categorical, real, ratio or again compositional scales. This contribution shows that it is sufficient to use some basic concepts and very few advanced tools from multivariate statistics (principal covariances, multivariate linear models, trellis or parallel plots, etc.) to build appropriate procedures for all these combinations of scales. This has some fundamental implications for their software implementation, and for how they might be taught to analysts who are not already experts in multivariate analysis.
Abstract:
A novel metric comparison of the appendicular skeleton (fore and hind limb) of different vertebrates using the Compositional Data Analysis (CDA) methodological approach is presented. A total of 355 specimens belonging to various taxa of Dinosauria (Sauropodomorpha, Theropoda, Ornithischia and Aves) and Mammalia (Prototheria, Metatheria and Eutheria) were analyzed with CDA. Special focus was placed on sauropodomorph dinosaurs, and the Aitchison distance was used as a measure of disparity in limb element proportions to infer some aspects of functional morphology.
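As a rough illustration of the disparity measure named in this abstract, the Aitchison distance between two compositions is the Euclidean distance between their centred log-ratio (clr) transforms. The sketch below assumes strictly positive parts; the function names and the limb measurements in the usage note are hypothetical and are not taken from the study.

```python
import math


def clr(composition):
    """Centred log-ratio transform of a composition with strictly positive parts."""
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(x, y):
    """Aitchison distance: Euclidean distance between the clr transforms of x and y."""
    clr_x, clr_y = clr(x), clr(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr_x, clr_y)))
```

For example, `aitchison_distance([300.0, 200.0, 100.0], [600.0, 400.0, 200.0])` is zero up to rounding, because the two hypothetical specimens have the same limb proportions and differ only in overall size. That scale invariance is the reason such a distance suits comparisons of proportions across animals of very different body size.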
Abstract:
The human gut microbiota comprises a diverse microbial consortium closely co-evolved with the human genome and diet. The importance of the gut microbiota in regulating human health and disease has, however, been largely overlooked due to the inaccessibility of the intestinal habitat, the complexity of the gut microbiota itself and the fact that many of its members resist cultivation and are in fact new to science. However, with the emergence of 16S rRNA molecular tools and "post-genomics" high-resolution technologies for examining microorganisms as they occur in nature without the need for prior laboratory culture, this limited view of the gut microbiota is rapidly changing. This review will discuss the application of molecular microbiological tools to study the human gut microbiota in a culture-independent manner. Genomics or metagenomics approaches have a tremendous capability to generate compositional data and to measure the metabolic potential encoded by the combined genomes of the gut microbiota. Another post-genomics approach, metabonomics, has the capacity to measure the metabolic kinetics or flux of metabolites through an ecosystem at a particular point in time or over a time course. Metabonomics thus derives data on the function of the gut microbiota in situ and on how it responds to different environmental stimuli, e.g. substrates like prebiotics, antibiotics and other drugs, and to disease. Recently these two culture-independent, high-resolution approaches have been combined into a single "transgenomic" approach which allows correlation of changes in metabolite profiles within human biofluids with microbiota compositional metagenomic data. Such approaches are providing novel insight into the composition, function and evolution of our gut microbiota.
Abstract:
One of the main questions in Neoproterozoic geology concerns the extent and dynamics of the glacial systems that are recorded on all continents. We present evidence for short transport distances and localized sediment sources for the Bebedouro Formation, which records Neoproterozoic glaciomarine sedimentation in the central-eastern Sao Francisco Craton (SFC), Brazil. New data are presented on clast composition, based on point counting in thin section and SHRIMP dating of pebbles and detrital zircon. Cluster analysis of clast compositional data revealed a pronounced spatial variability of clast composition in the diamictites, indicating the presence of individual glaciers or ice streams feeding the basin. Detrital zircon ages reveal distinct populations of Archean and Palaeoproterozoic age. The youngest detrital zircon, dated at 874 ± 9 Ma, constrains the maximum depositional age of these diamictites. We interpret the provenance of the glacial diamictites to be restricted to sources inside the SFC, suggesting deposition in an environment similar to ice streams from modern, high-latitude glaciers.
Abstract:
The stratigraphic subdivision and correlation of dune deposits is difficult, especially when age dates are not available. A better understanding of the controls on texture and composition of eolian sands is necessary to interpret ancient eolian sediments. The Imbituba-Jaguaruna coastal zone (Southern Brazil, 28°-29°S) stands out due to its four well-preserved Late Pleistocene (eolian generation 1) to Holocene eolian units (eolian generations 2, 3, and 4). In this study, we evaluate the grain-size and heavy-mineral characteristics of the Imbituba-Jaguaruna eolian units through statistical analysis of hundreds of sediment samples. Grain-size parameters and heavy-mineral content allow us to distinguish the Pleistocene from the Holocene units. The grain size displays a pattern of fining and better sorting from generation 1 (older) to 4 (younger), whereas the content of mechanically stable (dense and hard) heavy minerals decreases from eolian generation 1 to 4. The variation in grain size and heavy-mineral content records shifts in the origin and balance (input versus output) of eolian sediment supply attributable mainly to relative sea-level changes. Dunefields subjected to relative sea-level lowstand conditions (eolian generation 1) are characterized by lower accumulation rates and intense post-depositional dissection by fluvial incision. Low accumulation rates favor deflation in the eolian system, which promotes concentration of denser and stable heavy minerals (increase of ZTR index) as well as coarsening of eolian sands. Dissection involves the selective removal of finer sediments and less dense heavy minerals to the coastal source area. Under a high rate of relative sea-level rise and transgression (eolian generation 2), coastal erosion prevents deflation through high input of sediments to the coastal eolian source. This condition favors dunefield growth. Coastal erosion feeds sand from local sources to the eolian system, including sands from previous dunefields (eolian generation 1) and from drowned incised valleys. Therefore, dunefields corresponding to transgressive phases inherit the grain-size and heavy-mineral characteristics of previous dunefields, leading to selective enrichment of finer sands and lighter minerals. Eolian generations 3 and 4 developed during a regressive-progradational phase (Holocene relative sea-level highstand). The high rate of sediment supply during the highstand phase prevents deflation. The lack of coastal erosion favors sediment supply from distal sources (fluvial sediments rich in unstable heavy minerals). Thus, dunefields of transgressive and highstand systems tracts may be distinguished from dunefields of the lowstand systems tract through high rates of accumulation (low deflation) in the former. The sediment source of the transgressive dunefields (high input of previously deposited coastal sands) differs from that of the highstand dunefields (high input of fluvial distal sands). Based on this case study, we propose a general framework for the relation between relative sea level, sediment supply and the texture and mineralogy of eolian sediments deposited in siliciclastic wet coastal zones similar to the Imbituba-Jaguaruna coast. © 2009 Elsevier B.V. All rights reserved.
Abstract:
The response of guava trees to liming and fertilization can be monitored by plant tissue analysis. The nutritional profile is defined relative to nutrient concentration standards. However, standard nutrient concentrations are constantly criticized for not considering the interactions that occur among nutrients and for generating numerical biases arising from data redundancy, scale dependence and non-normal distribution. Compositional data analysis techniques can control these biases by balancing groups of nutrients, such as those involved in liming and fertilization. The use of sequentially arranged orthonormal isometric log ratios (ilr) avoids the numerical biases inherent in compositional data. The objectives of this work were to relate the nutrient balance of plant tissues to guava yield in differently limed and fertilized 'Paluma' orchards, and to adjust the current nutrient standards to the balance range of the most productive guava trees. One seven-year liming experiment and three three-year experiments with rates of N, P2O5 and K2O were conducted in 'Paluma' guava orchards on a Latossolo Vermelho-Amarelo (Red-Yellow Latosol). Plant concentrations of N, P, K, Ca and Mg were monitored annually. The balances [N, P, K | Ca, Mg], [N, P | K], [N | P] and [Ca | Mg] were selected to separate the effects of liming (Ca-Mg) and fertilizers (N-K) on macronutrient balances. The balances were influenced more by liming than by fertilization. Guava yield and nutrient balance allowed nutrient balance ranges to be defined and validated against the critical concentration ranges currently used in Brazil, combined into ilr coordinates.
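The four balances listed in this abstract can be computed with the usual ilr balance formula, sqrt(r·s/(r+s)) · ln(g(R)/g(S)), where g denotes the geometric mean and r and s are the numbers of parts in the two groups. The sketch below is an illustration only: it assumes that the group to the left of the bar goes in the numerator, and the function names and the leaf concentrations in the usage note are hypothetical rather than data from the study.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(numerator, denominator):
    """ilr balance between two groups of parts (assumed sign: numerator over denominator)."""
    r, s = len(numerator), len(denominator)
    ratio = geometric_mean(numerator) / geometric_mean(denominator)
    return math.sqrt(r * s / (r + s)) * math.log(ratio)


def guava_balances(c):
    """The four macronutrient balances named in the abstract, from a dict of concentrations."""
    return {
        "[N, P, K | Ca, Mg]": balance([c["N"], c["P"], c["K"]], [c["Ca"], c["Mg"]]),
        "[N, P | K]": balance([c["N"], c["P"]], [c["K"]]),
        "[N | P]": balance([c["N"]], [c["P"]]),
        "[Ca | Mg]": balance([c["Ca"]], [c["Mg"]]),
    }
```

Calling `guava_balances({"N": 20.0, "P": 1.5, "K": 18.0, "Ca": 8.0, "Mg": 2.5})` returns four coordinates for the five nutrients, that is D - 1 balances for a D-part composition. Multiplying every concentration by the same factor leaves the balances unchanged, which is how the approach sidesteps the scale dependence mentioned in the abstract.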
Abstract:
Compositional data from 152 stingless bee (Meliponini) honey samples were compiled from studies since 1964, and evaluated to propose a quality standard for this product. Since stingless bee honey has a different composition than Apis mellifera honey, some physicochemical parameters are presented according to stingless bee species. The entomological origin of the honey was known for 17 species of Meliponini from Brazil, one from Costa Rica, six from Mexico, 27 from Panama, one from Surinam, two from Trinidad & Tobago, and seven from Venezuela, most from the genus Melipona. The results varied as follows: moisture (19.9-41.9 g/100 g), pH (3.15-4.66), free acidity (5.9-109.0 meq/kg), ash (0.01-1.18 g/100 g), diastase activity (0.9-23.0 DN), electrical conductivity (0.49-8.77 mS/cm), HMF (0.4-78.4 mg/kg), invertase activity (19.8-90.1 IU), nitrogen (14.34-144.00 mg/100 g), reducing sugars (58.0-75.7 g/100 g) and sucrose (1.1-4.8 g/100 g). Moisture content of stingless bee honey is generally higher than the 20% maximum established for A. mellifera honey. Guidelines for further contributions would help make the physicochemical database of meliponine honey more objective, in order to use such data to set quality standards. Pollen analysis should be directed towards the recognition of unifloral honeys produced by stingless bees, in order to obtain standard products from botanical species. A honey quality control campaign directed to both stingless beekeepers and stingless bee honey hunters is needed, as is harmonization of analytical methods. © 2007 Asociación Interciencia.
Abstract:
Fertilization of guava relies on soil and tissue testing. Tissue tests are currently interpreted by comparing nutrient concentrations or dual ratios with critical values or ranges. The critical value approach is affected by nutrient interactions. Nutrient interactions can be described by dual ratios, where two nutrients are compressed into a single expression, or by ternary diagrams, where one redundant proportion can be computed as the difference between 100% and the sum of the other two. There are D(D-1) possible dual ratios in a D-part composition, and most of them are thus redundant. Nutrients are components of a mixture that convey relative, not absolute, information on the composition. There are D-1 balances between components or ingredients in any mixture. Compositional data are intrinsically redundant, scale dependent and non-normally distributed. Based on the principles of equilibrium and orthogonality, the nutrient balance concept projects D-1 isometric log ratio (ilr) coordinates into Euclidean space. The D-1 balances between groups of nutrients are ordered to reflect knowledge of plant physiology, soil fertility and crop management. Our objective was to evaluate the ilr approach using nutrient data from a guava orchard survey and fertilizer trials across the state of São Paulo, Brazil. Cationic balances varied widely between orchards. We found that the Redfield N/P ratio of 13 was critical for high guava yield. We present guava yield maps in ternary diagrams. Although the ratio between nutrients changing in the same direction with time is often assumed to be stationary, most guava nutrient balances and dual ratios were found to be non-stationary. The ilr model provided an unbiased nutrient diagnosis of guava. © ISHS.
Abstract:
Graduate Program in Applied and Computational Mathematics - FCT
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)