Digital Library

9 results for imputation

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain

A note on the Lorenz-maximal allocations in the imputation set

Relevance: 20.00%

Abstract:

In this note we introduce the Lorenz stable set and provide an axiomatic characterization in terms of constrained egalitarianism and projection consistency. On the domain of all coalitional games, we find that this solution connects the weak constrained egalitarian solution (Dutta and Ray, 1989) with their strong counterpart (Dutta and Ray, 1991).

Inference of distributional parameters from compositional samples containing nondetects

Relevance: 10.00%

Abstract:

Low concentrations of elements in geochemical analyses have the peculiarity of being compositional data and, for a given level of significance, are likely to be beyond the capabilities of laboratories to distinguish between minute concentrations and complete absence, thus preventing laboratories from reporting extremely low concentrations of the analyte. Instead, what is reported is the detection limit, which is the minimum concentration that conclusively differentiates between presence and absence of the element. A spatially distributed exhaustive sample is employed in this study to generate unbiased sub-samples, which are further censored to observe the effect that different detection limits and sample sizes have on the inference of population distributions starting from geochemical analyses having specimens below detection limit (nondetects). The isometric logratio transformation is used to convert the compositional data in the simplex to samples in real space, thus allowing the practitioner to properly borrow from the large source of statistical techniques valid only in real space. The bootstrap method is used to numerically investigate the reliability of inferring several distributional parameters employing different forms of imputation for the censored data. The case study illustrates that, in general, best results are obtained when imputations are made using the distribution best fitting the readings above detection limit and exposes the problems of other more widely used practices. When the sample is spatially correlated, it is necessary to combine the bootstrap with stochastic simulation.
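
The isometric logratio step mentioned in this abstract can be illustrated with a short sketch. It assumes one particular orthonormal basis (a sequential, pivot-type basis); the abstract does not say which basis the authors used, and the function names and the sample composition are hypothetical.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [v / total for v in parts]


def ilr(parts):
    """ilr coordinates under an assumed sequential orthonormal basis.

    Coordinate i (1-based) is sqrt(i / (i + 1)) * ln(g_i / x_{i+1}),
    where g_i is the geometric mean of the first i parts.
    All parts must be strictly positive.
    """
    logs = [math.log(v) for v in parts]
    coords = []
    for i in range(1, len(parts)):
        mean_log_first_i = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_log_first_i - logs[i]))
    return coords


def ilr_inverse(coords):
    """Map ilr coordinates back to a composition that sums to 1."""
    n_parts = len(coords) + 1
    clr = [0.0] * n_parts
    for i, z in enumerate(coords, start=1):
        scale = math.sqrt(i / (i + 1))
        for j in range(i):
            clr[j] += z * scale / i   # first i parts share +1/i
        clr[i] -= z * scale           # part i+1 carries -1
    return closure([math.exp(c) for c in clr])


# Hypothetical four-part composition (values are illustrative only).
composition = closure([70.0, 20.0, 9.5, 0.5])
coordinates = ilr(composition)
recovered = ilr_inverse(coordinates)
```

Because the logarithm is undefined at zero, a transformation of this kind can only be applied once every nondetect has been given a positive value, which is why the choice of imputation matters.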

Markov chain montecarlo method applied to rounding zeros of compositional data: first approach

Relevance: 10.00%

Abstract:

As stated in Aitchison (1986), a proper study of relative variation in a compositional data set should be based on logratios, and dealing with logratios excludes dealing with zeros. Nevertheless, it is clear that zero observations might be present in real data sets, either because the corresponding part is completely absent (essential zeros) or because it is below detection limit (rounded zeros). Because the second kind of zeros is usually understood as “a trace too small to measure”, it seems reasonable to replace them by a suitable small value, and this has been the traditional approach. As stated, e.g., by Tauber (1999) and by Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000), the principal problem in compositional data analysis is related to rounded zeros. One should be careful to use a replacement strategy that does not seriously distort the general structure of the data. In particular, the covariance structure of the involved parts, and thus the metric properties, should be preserved, as otherwise further analysis on subpopulations could be misleading. Following this point of view, a non-parametric imputation method is introduced in Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000). This method is analyzed in depth by Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2003), where it is shown that the theoretical drawbacks of the additive zero replacement method proposed in Aitchison (1986) can be overcome using a new multiplicative approach on the non-zero parts of a composition. The new approach has reasonable properties from a compositional point of view. In particular, it is “natural” in the sense that it recovers the “true” composition if replacement values are identical to the missing values, and it is coherent with the basic operations on the simplex. This coherence implies that the covariance structure of subcompositions with no zeros is preserved. As a generalization of the multiplicative replacement, in the same paper a substitution method for missing values on compositional data sets is introduced.
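
A minimal sketch of a multiplicative zero replacement, assuming the commonly cited form in which each zero part is set to a small value and the non-zero parts are shrunk by one common factor so that the total is unchanged. The function name, the per-part `delta` list and the example numbers are illustrative assumptions, not taken from the cited papers.

```python
def multiplicative_replacement(parts, delta, total=1.0):
    """Replace zeros in a composition while keeping its total.

    parts : composition summing to `total`, zeros mark rounded zeros
    delta : small replacement value for each part (used only where
            the part is zero); hypothetical values chosen by the user
    """
    # Mass that the replaced zeros will occupy.
    zero_mass = sum(d for v, d in zip(parts, delta) if v == 0)
    # Common shrink factor applied to every non-zero part.
    shrink = 1.0 - zero_mass / total
    return [d if v == 0 else v * shrink for v, d in zip(parts, delta)]


# Illustrative composition with one rounded zero in the last part.
observed = [0.6, 0.3, 0.1, 0.0]
replaced = multiplicative_replacement(observed, delta=[0.005] * 4)
```

Since every non-zero part is multiplied by the same factor, the ratios among non-zero parts are unchanged, which is the property behind the preserved covariance structure of zero-free subcompositions.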

alr approach for replacing values below the detection limit

Relevance: 10.00%

Abstract:

All of the imputation techniques usually applied for replacing values below the detection limit in compositional data sets have adverse effects on the variability. In this work we propose a modification of the EM algorithm that is applied using the additive log-ratio transformation. This new strategy is applied to a compositional data set and the results are compared with the usual imputation techniques.
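
For orientation, the additive log-ratio transformation itself can be sketched as follows. Only the transformation and its inverse are shown; the modified EM algorithm is not specified in the abstract and is not attempted here. Taking the last part as the divisor is an assumption, and the function names are hypothetical.

```python
import math


def alr(parts):
    """Additive log-ratio: ln(x_i / x_D) for every part except the last.

    The last part is used as the divisor (an assumed convention);
    all parts must be strictly positive.
    """
    reference = parts[-1]
    return [math.log(v / reference) for v in parts[:-1]]


def alr_inverse(coords):
    """Map alr coordinates back to a composition that sums to 1."""
    expanded = [math.exp(c) for c in coords] + [1.0]
    total = sum(expanded)
    return [v / total for v in expanded]


# Illustrative three-part composition.
coords = alr([0.5, 0.3, 0.2])
recovered = alr_inverse(coords)
```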

When zero doesn't mean it and other geomathematical mischief

Relevance: 10.00%

Abstract:

There is hardly a case in exploration geology where the studied data do not include values below the detection limit and/or zero values, and since most geological data follow lognormal distributions, these “zero data” represent a mathematical challenge for interpretation.
We need to start by recognizing that there are zero values in geology. For example, the amount of quartz in a foyaite (nepheline syenite) is zero, since quartz cannot coexist with nepheline. Another common essential zero is a North azimuth; however, we can always replace that zero with the value of 360°. These are known as “essential zeros”, but what can we do with “rounded zeros”, which result from values below the detection limit of the equipment?
Amalgamation, e.g. adding Na2O and K2O as total alkalis, is a solution, but sometimes we need to differentiate between a sodic and a potassic alteration. Pre-classification into groups requires a good knowledge of the distribution of the data and the geochemical characteristics of the groups, which is not always available. Setting the zero values equal to the limit of detection of the equipment used will generate spurious distributions, especially in ternary diagrams. The same will occur if we replace the zero values by a small amount using non-parametric or parametric techniques (imputation).
The method that we are proposing takes into consideration the well-known relationships between some elements. For example, in copper porphyry deposits there is always a good direct correlation between the copper values and the molybdenum ones, but while copper will always be above the limit of detection, many of the molybdenum values will be “rounded zeros”. So, we will take the lower quartile of the real molybdenum values and establish a regression equation with copper, and then we will estimate the “rounded” zero values of molybdenum from their corresponding copper values.
The method could be applied to any type of data, provided we first establish their correlation dependency.
One of the main advantages of this method is that we do not obtain a fixed value for the “rounded zeros”, but one that depends on the value of the other variable.
Key words: compositional data analysis, treatment of zeros, essential zeros, rounded zeros, correlation dependency
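
The regression idea in this abstract can be sketched roughly as below. The abstract does not say whether the fit is done on raw or log-transformed values, how the quartile is computed, or how predictions are constrained, so this sketch assumes an ordinary least-squares line on raw values and the default quartile rule of Python's `statistics.quantiles`. The function name and all numbers are hypothetical.

```python
import statistics


def impute_from_correlated_element(cu, mo):
    """Fill rounded zeros in `mo` (given as None) from paired `cu` values.

    Steps, under the assumptions stated above:
    1. keep the samples where Mo was actually detected;
    2. keep those at or below the lower quartile of detected Mo;
    3. fit Mo = intercept + slope * Cu by least squares on that subset;
    4. predict Mo for each sample where it was not detected.
    """
    detected = [(c, m) for c, m in zip(cu, mo) if m is not None]
    q1 = statistics.quantiles([m for _, m in detected], n=4)[0]
    low = [(c, m) for c, m in detected if m <= q1]

    mean_c = sum(c for c, _ in low) / len(low)
    mean_m = sum(m for _, m in low) / len(low)
    sxx = sum((c - mean_c) ** 2 for c, _ in low)
    if sxx == 0:
        raise ValueError("need at least two distinct Cu values to fit a line")
    sxy = sum((c - mean_c) * (m - mean_m) for c, m in low)
    slope = sxy / sxx
    intercept = mean_m - slope * mean_c

    return [m if m is not None else intercept + slope * c
            for c, m in zip(cu, mo)]


# Made-up paired assays (ppm); None marks a molybdenum rounded zero.
copper = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 500, 250]
molybdenum = [2, 4, 6, 8, 10, 12, 14, 16, None, None]
filled = impute_from_correlated_element(copper, molybdenum)
```

Nothing in this sketch keeps a predicted value positive or below the detection limit; a practical version would need to check both.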

Performance assessment and league tables. Comparing like with like.

Relevance: 10.00%

Abstract:

We formulate performance assessment as a problem of causal analysis and outline an approach based on the missing data principle for its solution. It is particularly relevant in the context of so-called league tables for educational, health-care and other public-service institutions. The proposed solution avoids comparisons of institutions that have substantially different clientele (intake).

A house price index defined in the potential outcomes framework

Relevance: 10.00%

Abstract:

Current methods for constructing house price indices are based on comparisons of sale prices of residential properties sold two or more times and on regression of the sale prices on the attributes of the properties and of their locations. The two methods have well recognised deficiencies, selection bias and model assumptions, respectively. We introduce a new method based on propensity score matching. The average house prices for two periods are compared by selecting pairs of properties, one sold in each period, that are as similar on a set of available attributes (covariates) as is feasible to arrange. The uncertainty associated with such matching is addressed by multiple imputation, framing the problem as involving missing values. The method is applied to a register of transactions of residential properties in New Zealand and compared with the established alternatives.
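
The pairing step can be illustrated with a deliberately simple sketch. It assumes the propensity scores have already been estimated elsewhere and uses greedy one-to-one nearest-neighbour matching with a caliper, which is only one of several possible matching rules. The multiple-imputation treatment of matching uncertainty is not reproduced. The function names, the caliper value and all data are hypothetical.

```python
def match_pairs(scores_a, scores_b, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    scores_a, scores_b : precomputed propensity scores for properties
                         sold in period A and period B
    caliper            : largest score difference accepted for a pair
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    available = set(range(len(scores_b)))
    pairs = []
    for i, score in enumerate(scores_a):
        if not available:
            break
        # Closest still-unmatched period-B property (ties: lowest index).
        j = min(available, key=lambda k: (abs(scores_b[k] - score), k))
        if abs(scores_b[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


def matched_price_ratio(prices_a, prices_b, pairs):
    """Ratio of mean period-B price to mean period-A price over the pairs."""
    mean_a = sum(prices_a[i] for i, _ in pairs) / len(pairs)
    mean_b = sum(prices_b[j] for _, j in pairs) / len(pairs)
    return mean_b / mean_a


# Made-up scores and prices (thousands) for two periods.
pairs = match_pairs([0.20, 0.50, 0.80], [0.82, 0.21, 0.49, 0.10])
ratio = matched_price_ratio([300, 400, 500], [550, 330, 440, 200], pairs)
```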

Polygenic determinants of white matter volume derived from GWAS lack reproducibility in a replicate sample

Relevance: 10.00%

Abstract:

A recent publication reported an exciting polygenic effect of schizophrenia (SCZ) risk variants, identified by a large genome-wide association study (GWAS), on total brain and white matter volumes in schizophrenic patients and, even more prominently, in healthy subjects. The aim of the present work was to replicate and then potentially extend these findings. Following the original publication, polygenic risk scores using single nucleotide polymorphism (SNP) information of SCZ GWAS (polygenic SCZ risk scores; PSS) were calculated in 122 healthy subjects enrolled in a structural magnetic resonance imaging (MRI) study. These scores were computed based on P-values and odds ratios available through the Psychiatric GWAS Consortium. In addition, polygenic white matter scores (PWM) were calculated, using the respective SNP subset in the original publication. None of the polygenic scores, either PSS or PWM, were found to be associated with total brain, white matter or gray matter volume in our replicate sample. Minor differences between the original and the present study that might have contributed to the lack of reproducibility (but are unlikely to explain it fully) are the number of subjects, ethnicity, age distribution, array technology, SNP imputation quality and MRI scanner type. In contrast to the original publication, our results do not reveal the slightest signal of association of the described sets of GWAS-identified SCZ risk variants with brain volumes in adults. Caution is indicated in interpreting studies building on polygenic risk scores without a replication sample.
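
A polygenic score built from P-values and odds ratios is often computed as a weighted allele count. The sketch below assumes that common form: SNPs passing a P-value threshold contribute their risk-allele dosage multiplied by the log odds ratio. The abstract does not give the threshold or weighting actually used, so the default threshold, the function name and the SNP identifiers are all hypothetical.

```python
import math


def polygenic_score(dosages, odds_ratios, p_values, p_threshold=0.05):
    """Weighted risk-allele count for one subject.

    dosages     : {snp_id: risk-allele count (0, 1 or 2, or an imputed
                  fractional dosage)}
    odds_ratios : {snp_id: odds ratio from the discovery GWAS}
    p_values    : {snp_id: discovery P-value}
    p_threshold : assumed inclusion threshold, not taken from the study
    """
    score = 0.0
    for snp, dosage in dosages.items():
        # Skip SNPs without discovery statistics or above the threshold.
        if snp in odds_ratios and p_values.get(snp, 1.0) <= p_threshold:
            score += math.log(odds_ratios[snp]) * dosage
    return score


# Placeholder identifiers and statistics for a single subject.
subject = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
ors = {"snp_a": 1.2, "snp_b": 0.9, "snp_c": 1.5}
pvals = {"snp_a": 0.01, "snp_b": 0.2, "snp_c": 0.001}
score = polygenic_score(subject, ors, pvals)
```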
