986 resultados para Multivariable analysis


Relevância:

30.00% 30.00%

Publicador:

Resumo:

We compare correspondance análisis to the logratio approach based on compositional data. We also compare correspondance análisis and an alternative approach using Hellinger distance, for representing categorical data in a contingency table. We propose a coefficient which globally measures the similarity between these approaches. This coefficient can be decomposed into several components, one component for each principal dimension, indicating the contribution of the dimensions to the difference between the two representations. These three methods of representation can produce quite similar results. One illustrative example is given

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Starting with logratio biplots for compositional data, which are based on the principle of subcompositional coherence, and then adding weights, as in correspondence analysis, we rediscover Lewi's spectral map and many connections to analyses of two-way tables of non-negative data. Thanks to the weighting, the method also achieves the property of distributional equivalence

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Compositional data naturally arises from the scientific analysis of the chemicalcomposition of archaeological material such as ceramic and glass artefacts. Data of thistype can be explored using a variety of techniques, from standard multivariate methodssuch as principal components analysis and cluster analysis, to methods based upon theuse of log-ratios. The general aim is to identify groups of chemically similar artefactsthat could potentially be used to answer questions of provenance.This paper will demonstrate work in progress on the development of a documentedlibrary of methods, implemented using the statistical package R, for the analysis ofcompositional data. R is an open source package that makes available very powerfulstatistical facilities at no cost. We aim to show how, with the aid of statistical softwaresuch as R, traditional exploratory multivariate analysis can easily be used alongside, orin combination with, specialist techniques of compositional data analysis.The library has been developed from a core of basic R functionality, together withpurpose-written routines arising from our own research (for example that reported atCoDaWork'03). In addition, we have included other appropriate publicly availabletechniques and libraries that have been implemented in R by other authors. Availablefunctions range from standard multivariate techniques through to various approaches tolog-ratio analysis and zero replacement. We also discuss and demonstrate a smallselection of relatively new techniques that have hitherto been little-used inarchaeometric applications involving compositional data. The application of the libraryto the analysis of data arising in archaeometry will be demonstrated; results fromdifferent analyses will be compared; and the utility of the various methods discussed

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We shall call an n × p data matrix fully-compositional if the rows sum to a constant, and sub-compositional if the variables are a subset of a fully-compositional data set1. Such data occur widely in archaeometry, where it is common to determine the chemical composition of ceramic, glass, metal or other artefacts using techniques such as neutron activation analysis (NAA), inductively coupled plasma spectroscopy (ICPS), X-ray fluorescence analysis (XRF) etc. Interest often centres on whether there are distinct chemical groups within the data and whether, for example, these can be associated with different origins or manufacturing technologies

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Presentation in CODAWORK'03, session 4: Applications to archeometry

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Developments in the statistical analysis of compositional data over the last twodecades have made possible a much deeper exploration of the nature of variability,and the possible processes associated with compositional data sets from manydisciplines. In this paper we concentrate on geochemical data sets. First we explainhow hypotheses of compositional variability may be formulated within the naturalsample space, the unit simplex, including useful hypotheses of subcompositionaldiscrimination and specific perturbational change. Then we develop through standardmethodology, such as generalised likelihood ratio tests, statistical tools to allow thesystematic investigation of a complete lattice of such hypotheses. Some of these tests are simple adaptations of existing multivariate tests but others require specialconstruction. We comment on the use of graphical methods in compositional dataanalysis and on the ordination of specimens. The recent development of the conceptof compositional processes is then explained together with the necessary tools for astaying- in-the-simplex approach, namely compositional singular value decompositions. All these statistical techniques are illustrated for a substantial compositional data set, consisting of 209 major-oxide and rare-element compositions of metamorphosed limestones from the Northeast and Central Highlands of Scotland.Finally we point out a number of unresolved problems in the statistical analysis ofcompositional processes

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The use of perturbation and power transformation operations permits the investigation of linear processes in the simplex as in a vectorial space. When the investigated geochemical processes can be constrained by the use of well-known starting point, the eigenvectors of the covariance matrix of a non-centred principalcomponent analysis allow to model compositional changes compared with a reference point.The results obtained for the chemistry of water collected in River Arno (central-northern Italy) have open new perspectives for considering relative changes of the analysed variables and to hypothesise the relative effect of different acting physical-chemical processes, thus posing the basis for a quantitative modelling

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Usually, psychometricians apply classical factorial analysis to evaluate construct validity of order rankscales. Nevertheless, these scales have particular characteristics that must be taken into account: totalscores and rank are highly relevant

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper addresses the application of a PCA analysis on categorical data prior to diagnose a patients data set using a Case-Based Reasoning (CBR) system. The particularity is that the standard PCA techniques are designed to deal with numerical attributes, but our medical data set contains many categorical data and alternative methods as RS-PCA are required. Thus, we propose to hybridize RS-PCA (Regular Simplex PCA) and a simple CBR. Results show how the hybrid system produces similar results when diagnosing a medical data set, that the ones obtained when using the original attributes. These results are quite promising since they allow to diagnose with less computation effort and memory storage

Relevância:

30.00% 30.00%

Publicador:

Resumo:

First discussion on compositional data analysis is attributable to Karl Pearson, in 1897. However, notwithstanding the recent developments on algebraic structure of the simplex, more than twenty years after Aitchison’s idea of log-transformations of closed data, scientific literature is again full of statistical treatments of this type of data by using traditional methodologies. This is particularly true in environmental geochemistry where besides the problem of the closure, the spatial structure (dependence) of the data have to be considered. In this work we propose the use of log-contrast values, obtained by asimplicial principal component analysis, as LQGLFDWRUV of given environmental conditions. The investigation of the log-constrast frequency distributions allows pointing out the statistical laws able togenerate the values and to govern their variability. The changes, if compared, for example, with the mean values of the random variables assumed as models, or other reference parameters, allow definingmonitors to be used to assess the extent of possible environmental contamination. Case study on running and ground waters from Chiavenna Valley (Northern Italy) by using Na+, K+, Ca2+, Mg2+, HCO3-, SO4 2- and Cl- concentrations will be illustrated

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Pounamu (NZ jade), or nephrite, is a protected mineral in its natural form following thetransfer of ownership back to Ngai Tahu under the Ngai Tahu (Pounamu Vesting) Act 1997.Any theft of nephrite is prosecutable under the Crimes Act 1961. Scientific evidence isessential in cases where origin is disputed. A robust method for discrimination of thismaterial through the use of elemental analysis and compositional data analysis is required.Initial studies have characterised the variability within a given nephrite source. This hasincluded investigation of both in situ outcrops and alluvial material. Methods for thediscrimination of two geographically close nephrite sources are being developed.Key Words: forensic, jade, nephrite, laser ablation, inductively coupled plasma massspectrometry, multivariate analysis, elemental analysis, compositional data analysis

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Intrauterine devices (IUDs), long-acting and reversible contraceptives, induce a number of immunological and biochemical changes in the uterine environment that could affect endometrial cancer (EC) risk. We addressed this relationship through a pooled analysis of data collected in the Epidemiology of Endometrial Cancer Consortium. We combined individual-level data from 4 cohort and 14 case-control studies, in total 8,801 EC cases and 15,357 controls. Using multivariable logistic regression, we estimated pooled odds ratios (pooled-ORs) and 95% confidence intervals (CIs) for EC risk associated with ever use, type of device, ages at first and last use, duration of use and time since last use, stratified by study and adjusted for confounders. Ever use of IUDs was inversely related to EC risk (pooled-OR = 0.81, 95% CI = 0.74-0.90). Compared with never use, reduced risk of EC was observed for inert IUDs (pooled-OR = 0.69, 95% CI = 0.58-0.82), older age at first use (≥35 years pooled-OR = 0.53, 95% CI = 0.43-0.67), older age at last use (≥45 years pooled-OR = 0.60, 95% CI = 0.50-0.72), longer duration of use (≥10 years pooled-OR = 0.61, 95% CI = 0.52-0.71) and recent use (within 1 year of study entry pooled-OR = 0.39, 95% CI = 0.30-0.49). Future studies are needed to assess the respective roles of detection biases and biologic effects related to foreign body responses in the endometrium, heavier bleeding (and increased clearance of carcinogenic cells) and localized hormonal changes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One of the disadvantages of old age is that there is more past than future: this,however, may be turned into an advantage if the wealth of experience and, hopefully,wisdom gained in the past can be reflected upon and throw some light on possiblefuture trends. To an extent, then, this talk is necessarily personal, certainly nostalgic,but also self critical and inquisitive about our understanding of the discipline ofstatistics. A number of almost philosophical themes will run through the talk: searchfor appropriate modelling in relation to the real problem envisaged, emphasis onsensible balances between simplicity and complexity, the relative roles of theory andpractice, the nature of communication of inferential ideas to the statistical layman, theinter-related roles of teaching, consultation and research. A list of keywords might be:identification of sample space and its mathematical structure, choices betweentransform and stay, the role of parametric modelling, the role of a sample spacemetric, the underused hypothesis lattice, the nature of compositional change,particularly in relation to the modelling of processes. While the main theme will berelevance to compositional data analysis we shall point to substantial implications forgeneral multivariate analysis arising from experience of the development ofcompositional data analysis

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Enterococci are reportedly the third most common group of endocarditis-causing pathogens but data on enterococcal infective endocarditis (IE) are limited. The aim of this study was to analyse the characteristics and prognostic factors of enterococcal IE within the International Collaboration on Endocarditis. In this multicentre, prospective observational cohort study of 4974 adults with definite IE recorded from June 2000 to September 2006, 500 patients had enterococcal IE. Their characteristics were described and compared with those of oral and group D streptococcal IE. Prognostic factors for enterococcal IE were analysed using multivariable Cox regression models. The patients' mean age was 65 years and 361/500 were male. Twenty-three per cent (117/500) of cases were healthcare related. Enterococcal IE were more frequent than oral and group D streptococcal IE in North America. The 1-year mortality rate was 28.9% (144/500). E. faecalis accounted for 90% (453/500) of enterococcal IE. Resistance to vancomycin was observed in 12 strains, eight of which were observed in North America, where they accounted for 10% (8/79) of enterococcal strains, and was more frequent in E. faecium than in E. faecalis (3/16 vs. 7/364 , p 0.01). Variables significantly associated with 1-year mortality were heart failure (HR 2.4, 95% CI 1.7--3.5, p <0.0001), stroke (HR 1.9, 95% CI 1.3--2.8, p 0.001) and age (HR 1.02 per 1-year increment, 95% CI 1.01--1.04, p 0.002). Surgery was not associated with better outcome. Enterococci are an important cause of IE, with a high mortality rate. Healthcare association and vancomycin resistance are common in particular in North America.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper addresses the application of a PCA analysis on categorical data prior to diagnose a patients data set using a Case-Based Reasoning (CBR) system. The particularity is that the standard PCA techniques are designed to deal with numerical attributes, but our medical data set contains many categorical data and alternative methods as RS-PCA are required. Thus, we propose to hybridize RS-PCA (Regular Simplex PCA) and a simple CBR. Results show how the hybrid system produces similar results when diagnosing a medical data set, that the ones obtained when using the original attributes. These results are quite promising since they allow to diagnose with less computation effort and memory storage