1000 results for Martín Biedma
Abstract:
The Hardy-Weinberg law, formulated about 100 years ago, states that under certain assumptions, the three genotypes AA, AB and BB at a bi-allelic locus are expected to occur in the proportions p^2, 2pq, and q^2 respectively, where p is the allele frequency of A, and q = 1-p. There are many statistical tests being used to check whether empirical marker data obey the Hardy-Weinberg principle. Among these are the classical chi-square test (with or without continuity correction), the likelihood ratio test, Fisher's exact test, and exact tests in combination with Monte Carlo and Markov chain algorithms. Tests for Hardy-Weinberg equilibrium (HWE) are numerical in nature, requiring the computation of a test statistic and a p-value. There is, however, ample room for the use of graphics in HWE tests, in particular for the ternary plot. Nowadays, many genetic studies use genetic markers known as Single Nucleotide Polymorphisms (SNPs). SNP data come in the form of counts, but from the counts one typically computes genotype frequencies and allele frequencies. These frequencies satisfy the unit-sum constraint, and their analysis therefore falls within the realm of compositional data analysis (Aitchison, 1986). SNPs are usually bi-allelic, which implies that the genotype frequencies can be adequately represented in a ternary plot. Compositions that are in exact HWE describe a parabola in the ternary plot. Compositions for which HWE cannot be rejected in a statistical test are typically "close" to the parabola, whereas compositions that differ significantly from HWE are "far". By rewriting the statistics used to test for HWE in terms of heterozygote frequencies, acceptance regions for HWE can be obtained that can be depicted in the ternary plot. This way, compositions can be tested for HWE purely on the basis of their position in the ternary plot (Graffelman & Morales, 2008). This leads to graphical representations in which large numbers of SNPs can be tested for HWE in a single graph. Several examples of graphical tests for HWE (implemented in R software) will be shown, using SNP data from different human populations.
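As a rough illustration of the numerical side of such tests, the classical chi-square test without continuity correction can be sketched as follows. This is a minimal Python sketch and not the R implementation mentioned in the abstract. The function name `hwe_chisq` and the genotype counts in the usage lines are hypothetical. It assumes both alleles are observed, so that no expected count is zero.

```python
import math

def hwe_chisq(n_aa, n_ab, n_bb):
    """Chi-square test for HWE (1 degree of freedom, no continuity correction).

    Assumes both alleles are present; a monomorphic marker would give
    zero expected counts and is not handled here.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # allele frequency of A
    q = 1.0 - p                          # allele frequency of B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For one degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return p, stat, p_value

# Hypothetical genotype counts (AA, AB, BB) for a single SNP.
p, stat, p_value = hwe_chisq(30, 50, 20)
print(f"p = {p:.3f}, chi-square = {stat:.3f}, p-value = {p_value:.3f}")
```

Because the statistic depends only on the sample size, the allele frequency and the heterozygote count, its acceptance region can be expressed in terms of heterozygote frequency, which is what makes the ternary-plot representation described above possible.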
Abstract:
The amalgamation operation is frequently used to reduce the number of parts of compositional data, but it is a non-linear operation in the simplex with the usual geometry, the Aitchison geometry. The concept of balances between groups, a particular coordinate system designed over binary partitions of the parts, could be an alternative to amalgamation in some cases. In this work we discuss the proper application of both concepts using a real data set corresponding to behavioral measures of pregnant sows.
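The contrast between the two operations can be sketched numerically. The Python sketch below assumes the usual definitions: perturbation is the closed component-wise product, and a balance is a scaled log-ratio of the geometric means of two groups of parts. The helper names and the two four-part compositions are hypothetical and are not taken from the sow data set.

```python
import math

def closure(x):
    """Rescale positive values so that they sum to one."""
    total = sum(x)
    return [v / total for v in x]

def perturb(x, y):
    """Perturbation: the addition-like operation of the Aitchison geometry."""
    return closure([a * b for a, b in zip(x, y)])

def amalgamate(x, groups):
    """Sum the parts inside each group of indices."""
    return [sum(x[i] for i in g) for g in groups]

def gmean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def balance(x, num, den):
    """Balance between the groups of parts indexed by num and den."""
    r, s = len(num), len(den)
    ratio = gmean([x[i] for i in num]) / gmean([x[i] for i in den])
    return math.sqrt(r * s / (r + s)) * math.log(ratio)

# Hypothetical four-part compositions.
x = [0.1, 0.2, 0.3, 0.4]
y = [0.7, 0.1, 0.1, 0.1]
groups = [[0, 1], [2, 3]]

# Balances are additive under perturbation (linear in the simplex).
print(balance(perturb(x, y), *groups), balance(x, *groups) + balance(y, *groups))

# Amalgamation does not commute with perturbation (non-linear).
print(amalgamate(perturb(x, y), groups))
print(perturb(amalgamate(x, groups), amalgamate(y, groups)))
```

The first printed pair agrees up to rounding, while the last two printed compositions differ, which is the non-linearity the abstract refers to.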
Abstract:
Planners in public and private institutions would like coherent forecasts of the components of age-specific mortality, such as causes of death. This has been difficult to achieve because the relative values of the forecast components often fail to behave in a way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It has been shown that cause-specific mortality forecasts are pessimistic when compared with all-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approach of using log mortality rates and forecasts the density of deaths in the life table. Since these values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbing state), they are intrinsically relative rather than absolute values across decrements as well as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison (1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that the unit sum constraint is honoured. The structure of the best-known single-decrement mortality-rate forecasting model, devised by Lee and Carter (1992), is expressed in compositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortality by cause of death for Japan.
Abstract:
The theory of compositional data analysis is often focused on the composition only. However, in practical applications we often treat a composition together with covariables on some other scale. This contribution systematically gathers and develops statistical tools for this situation. For instance, for the graphical display of the dependence of a composition on a categorical variable, a colored set of ternary diagrams might be a good idea for a first look at the data, but it will quickly hide important aspects if the composition has many parts or takes extreme values. On the other hand, colored scatterplots of ilr components may not be very instructive for the analyst if the conventional, black-box ilr is used. Thinking in terms of the Euclidean structure of the simplex, we suggest setting up appropriate projections, which on the one hand show the compositional geometry and on the other hand are still comprehensible to a non-expert analyst and readable for all locations and scales of the data. This is done, e.g., by defining special balance displays with carefully selected axes. Following this idea, we need to ask systematically how to display, explore, describe, and test the relation to complementary or explanatory data of categorical, real, ratio or again compositional scales. This contribution shows that it is sufficient to use some basic concepts and very few advanced tools from multivariate statistics (principal covariances, multivariate linear models, trellis or parallel plots, etc.) to build appropriate procedures for all these combinations of scales. This has some fundamental implications for their software implementation and for how they might be taught to analysts who are not already experts in multivariate analysis.
Abstract:
BACKGROUND: The elongase of long chain fatty acids family 6 (ELOVL6) is an enzyme that specifically catalyzes the elongation of saturated and monounsaturated fatty acids with 12, 14 and 16 carbons. ELOVL6 is expressed in lipogenic tissues and it is regulated by sterol regulatory element binding protein 1 (SREBP-1). OBJECTIVE: We investigated whether ELOVL6 genetic variation is associated with insulin sensitivity in a population from southern Spain. DESIGN: We undertook a prospective, population-based study collecting phenotypic, metabolic, nutritional and genetic information. Measurements were made of weight and height and the body mass index (BMI) was calculated. Insulin resistance was measured by homeostasis model assessment. The type of dietary fat was assessed from samples of cooking oil taken from the participants' kitchens and analyzed by gas chromatography. Five SNPs of the ELOVL6 gene were analyzed by SNPlex. RESULTS: Carriers of the minor alleles of the SNPs rs9997926 and rs6824447 had a lower risk of having high HOMA_IR, whereas carriers of the minor allele rs17041272 had a higher risk of being insulin resistant. An interaction was detected between the rs6824447 polymorphism and the intake of oil in relation with insulin resistance, such that carriers of this minor allele who consumed sunflower oil had lower HOMA_IR than those who did not have this allele (P = 0.001). CONCLUSIONS: Genetic variations in the ELOVL6 gene were associated with insulin sensitivity in this population-based study.
Abstract:
The self-organizing map (Kohonen 1997) is a type of artificial neural network developed to explore patterns in high-dimensional multivariate data. The conventional version of the algorithm involves the use of the Euclidean metric in the process of adaptation of the model vectors, thus rendering, in theory, the whole methodology incompatible with non-Euclidean geometries. In this contribution we explore the two main aspects of the problem:
1. Whether the conventional approach using the Euclidean metric can yield valid results with compositional data.
2. Whether a modification of the conventional approach, replacing vector sum and scalar multiplication by the canonical operators in the simplex (i.e. perturbation and powering), can converge to an adequate solution.
Preliminary tests showed that both methodologies can be used on compositional data. However, the modified version of the algorithm performs worse than the conventional version, in particular when the data are pathological. Moreover, the conventional approach converges faster to a solution when the data are "well-behaved".
Key words: Self Organizing Map; Artificial Neural Networks; Compositional data
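The two variants of the adaptation step can be sketched as follows. The abstract does not state the exact update rule, so this Python sketch assumes the standard Kohonen update m ← m + α(x − m) and obtains the simplex variant by substituting perturbation for the sum and powering for the scalar product. The neighbourhood function is folded into the single rate `alpha`, and all function names and values are hypothetical.

```python
def closure(x):
    """Rescale positive values so that they sum to one."""
    total = sum(x)
    return [v / total for v in x]

def perturb(x, y):
    """Perturbation: closed component-wise product."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, a):
    """Powering: closed component-wise power by the scalar a."""
    return closure([v ** a for v in x])

def som_update_euclid(m, x, alpha):
    """Conventional step: move model vector m towards x in Euclidean space."""
    return [mi + alpha * (xi - mi) for mi, xi in zip(m, x)]

def som_update_simplex(m, x, alpha):
    """Assumed simplex analogue: m (+) alpha (.) (x (-) m)."""
    diff = perturb(x, [1.0 / v for v in m])   # x "minus" m in the simplex
    return perturb(m, power(diff, alpha))

# Hypothetical model vector and data point, both three-part compositions.
m = [0.2, 0.3, 0.5]
x = [0.6, 0.3, 0.1]
print(som_update_euclid(m, x, 0.5))
print(som_update_simplex(m, x, 0.5))
```

Both updates keep the model vector in the simplex when 0 ≤ alpha ≤ 1, but they follow different paths from m to x: a straight segment in one case and a compositional line in the other.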
Abstract:
In most psychological tests and questionnaires, a test score is obtained by taking the sum of the item scores. In virtually all cases where the test or questionnaire contains multidimensional forced-choice items, this traditional scoring method is also applied. We argue that the summation of scores obtained with multidimensional forced-choice items produces uninterpretable test scores. Therefore, we propose three alternative scoring methods: a weak and a strict rank preserving scoring method, which both allow an ordinal interpretation of test scores; and a ratio preserving scoring method, which allows a proportional interpretation of test scores. Each proposed scoring method yields an index for each respondent indicating the degree to which the response pattern is inconsistent. Analysis of real data showed that with respect to rank preservation, the weak and strict rank preserving methods resulted in lower inconsistency indices than the traditional scoring method; with respect to ratio preservation, the ratio preserving scoring method resulted in lower inconsistency indices than the traditional scoring method.
Abstract:
Functional Data Analysis (FDA) deals with samples where a whole function is observed for each individual. A particular case of FDA is when the observed functions are density functions, which are also an example of infinite-dimensional compositional data. In this work we compare several methods for dimensionality reduction for this particular type of data: functional principal components analysis (PCA) with or without a previous data transformation, and multidimensional scaling (MDS) for different inter-density distances, one of them taking into account the compositional nature of density functions. The different methods are applied to both artificial and real data (household income distributions).
Abstract:
Many multivariate methods that are apparently distinct can be linked by introducing one or more parameters in their definition. Methods that can be linked in this way are correspondence analysis, unweighted or weighted logratio analysis (the latter also known as "spectral mapping"), nonsymmetric correspondence analysis, principal component analysis (with and without logarithmic transformation of the data) and multidimensional scaling. In this presentation I will show how several of these methods, which are frequently used in compositional data analysis, may be linked through parametrizations such as power transformations, linear transformations and convex linear combinations. Since the methods of interest here all lead to visual maps of data, a "movie" can be made where the linking parameter is allowed to vary in small steps: the results are recalculated "frame by frame" and one can see the smooth change from one method to another. Several of these "movies" will be shown, giving a deeper insight into the similarities and differences between these methods.
Abstract:
The preceding two editions of CoDaWork included talks on the possible consideration of densities as infinite compositions: Egozcue and Díaz-Barrero (2003) extended the Euclidean structure of the simplex to a Hilbert space structure of the set of densities within a bounded interval, and van den Boogaart (2005) generalized this to the set of densities bounded by an arbitrary reference density. From the many variations of the Hilbert structures available, we work with three cases. For bounded variables, a basis derived from Legendre polynomials is used. For variables with a lower bound, we standardize them with respect to an exponential distribution and express their densities as coordinates in a basis derived from Laguerre polynomials. Finally, for unbounded variables, a normal distribution is used as reference, and coordinates are obtained with respect to a Hermite-polynomials-based basis. To get the coordinates, several approaches can be considered. A numerical accuracy problem occurs if one estimates the coordinates directly by using discretized scalar products. Thus we propose to use a weighted linear regression approach, where all k-order polynomials are used as predictand variables and weights are proportional to the reference density. Finally, for the case of second-order Hermite polynomials (normal reference) and first-order Laguerre polynomials (exponential), one can also derive the coordinates from their relationships to the classical mean and variance. Apart from these theoretical issues, this contribution focuses on the application of this theory to two main problems in sedimentary geology: the comparison of several grain size distributions, and the comparison among different rocks of the empirical distribution of a property measured on a batch of individual grains from the same rock or sediment, such as their composition.
Abstract:
In this paper we examine the problem of compositional data from a different starting point. Chemical compositional data, as used in provenance studies on archaeological materials, will be approached from measurement theory. The results will show, in a very intuitive way, that chemical data can only be treated by using the approach developed for compositional data. It will be shown that compositional data analysis is a particular case in projective geometry, when the projective coordinates are in the positive orthant and have the properties of logarithmic interval metrics. Moreover, it will be shown that this approach can be extended to a very large number of applications, including shape analysis. This will be exemplified with a case study in the architecture of Early Christian churches dating back to the 5th-7th centuries AD.
Abstract:
A novel metric comparison of the appendicular skeleton (fore and hind limb) of different vertebrates using the Compositional Data Analysis (CDA) methodological approach is presented. A total of 355 specimens belonging to various taxa of Dinosauria (Sauropodomorpha, Theropoda, Ornithischia and Aves) and Mammalia (Prototheria, Metatheria and Eutheria) were analyzed with CDA. A special focus has been put on Sauropodomorpha dinosaurs, and the Aitchison distance has been used as a measure of disparity in limb element proportions to infer some aspects of functional morphology.
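For readers unfamiliar with the disparity measure, the Aitchison distance between two compositions is the Euclidean distance between their centered logratio (clr) vectors. A minimal Python sketch follows. The limb-segment lengths are invented purely for illustration and are not data from the study.

```python
import math

def clr(x):
    """Centered logratio transform of a vector of positive parts."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr vectors of x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

# Hypothetical lengths (any common unit) of three limb elements
# for two specimens; only the proportions matter.
specimen_a = [120.0, 95.0, 40.0]
specimen_b = [300.0, 210.0, 130.0]
print(aitchison_distance(specimen_a, specimen_b))

# Rescaling a specimen leaves the distance unchanged.
print(aitchison_distance(specimen_a, [2.0 * v for v in specimen_a]))
```

The distance depends only on the ratios between parts, so specimens of very different absolute size can be compared directly on limb proportions.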
Abstract:
Factor analysis, a frequently used technique for multivariate data inspection, is also widely used in compositional data analysis. The usual way is to use a centered logratio (clr) transformation to obtain the random vector y of dimension D. The factor model is then
y = Λf + e    (1)
with the factors f of dimension k < D, the error term e, and the loadings matrix Λ. Using the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as
Cov(y) = ΛΛ^T + ψ    (2)
where ψ = Cov(e) has a diagonal form. The diagonal elements of ψ as well as the loadings matrix Λ are estimated from an estimate of Cov(y). The observed clr-transformed data Y are regarded as realizations of the random vector y. Outliers or deviations from the idealized model assumptions of factor analysis can severely affect the parameter estimation. As a way out, robust estimation of the covariance matrix of Y will lead to robust estimates of Λ and ψ in (2), see Pison et al. (2003). Well-known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g., Maronna et al., 2006), rely on a full-rank data matrix Y, which is not the case for clr-transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix Y is transformed to a matrix Z by using an orthonormal basis of lower dimension. Using the ilr-transformed data, a robust covariance matrix C(Z) can be estimated. The result can be back-transformed to the clr space by
C(Y) = V C(Z) V^T
where the matrix V with orthonormal columns comes from the relation between the clr and the ilr transformation. Now the parameters in the model (2) can be estimated (Basilevsky, 1994), and the results have a direct interpretation since the links to the original variables are still preserved. The above procedure will be applied to data from geochemistry. Our special interest is in comparing the results with those of Reimann et al. (2002) for the Kola project data.
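The back-transformation step can be sketched in a few lines of Python. Two assumptions are made here that are not in the abstract: the orthonormal basis V is a Helmert-type basis (any orthonormal ilr basis would do), and the classical sample covariance stands in for a robust estimator such as the MCD, which would be passed in through the hypothetical `cov_estimator` argument. The data are invented three-part compositions.

```python
import math

def clr(x):
    """Centered logratio transform of a vector of positive parts."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]

def ilr_basis(D):
    """D x (D-1) matrix V with orthonormal columns, each summing to zero."""
    V = [[0.0] * (D - 1) for _ in range(D)]
    for j in range(1, D):
        norm = math.sqrt(j * (j + 1))
        for i in range(j):
            V[i][j - 1] = 1.0 / norm
        V[j][j - 1] = -j / norm
    return V

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def covariance(rows):
    """Classical sample covariance; a stand-in for a robust estimator."""
    n, k = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(k)]
            for i in range(k)]

def clr_covariance_via_ilr(data, cov_estimator=covariance):
    """Estimate C(Z) on ilr coordinates, then return C(Y) = V C(Z) V^T."""
    V = ilr_basis(len(data[0]))
    Y = [clr(x) for x in data]      # n x D, rank-deficient
    Z = matmul(Y, V)                # n x (D-1), full rank in general
    CZ = cov_estimator(Z)
    return matmul(matmul(V, CZ), transpose(V))

# Hypothetical three-part compositions.
data = [[0.2, 0.3, 0.5], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2],
        [0.3, 0.2, 0.5], [0.25, 0.25, 0.5]]
for row in clr_covariance_via_ilr(data):
    print([round(v, 4) for v in row])
```

Each row of the resulting matrix sums to zero, reflecting the singularity of the clr covariance. With the classical estimator the result coincides with the covariance computed directly on the clr data. The ilr detour only changes the outcome when a robust estimator that needs full-rank input is plugged in.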
Abstract:
Angiotensin II (Ang II) highly stimulates superoxide anion production by neutrophils. The G-protein Rac2 modulates the activity of NADPH oxidase in response to various stimuli. Here, we describe that Ang II induced both Rac2 translocation from the cytosol to the plasma membrane and Rac2 GTP-binding activity. Furthermore, Clostridium difficile toxin A, an inhibitor of the Rho-GTPase family members Rho, Rac and Cdc42, prevented Ang II-elicited O2-/ROS production, phosphorylation of the mitogen-activated protein kinases (MAPKs) p38, extracellular signal-regulated kinase 1/2 (ERK1/2) and c-Jun N-terminal kinase 1/2, and Rac2 activation. Rac2 GTPase inhibition by C. difficile toxin A was accompanied by a robust reduction of the cytosolic Ca2+ elevation induced by Ang II in human neutrophils. Furthermore, SB203580 and PD098059, which act as inhibitors of p38MAPK and ERK1/2 respectively, wortmannin, an inhibitor of phosphatidylinositol-3-kinase, and cyclosporin A, a calcineurin inhibitor, hindered both translocation of Rac2 from the cytosol to the plasma membrane and enhancement of Rac2 GTP-binding elicited by Ang II. These results provide evidence that the activation of Rac2 by Ang II is exerted through multiple signalling pathways, involving Ca2+/calcineurin and protein kinases, the elucidation of which should be insightful in the design of new therapies aimed at reversing the inflammation of vessel walls found in a number of cardiovascular diseases.
Abstract:
Infections of the catheter wound in peritoneal dialysis are the most frequent cause of morbidity in patients who undergo this technique. There are a number of procedures for the care of the wound, and it is not easy to define a single method that will guarantee good condition of the wound. In order to evaluate the behaviour of the wound in relation to the procedure used in its care, we studied 306 patients over 24 months, compiling socio-demographic and clinical variables. We found a high incidence of infections caused by Gram-positive skin and mucosal germs, strongly correlated with the patient or family carer being a nasal carrier of Staphylococcus aureus, and appearing more frequently in patients who do not remove the wound dressing in the shower. We also detected an increase in Pseudomonas infections when the patient does not dry the wound with a hair-dryer.