51 results for INDEPENDENT COMPONENT ANALYSIS (ICA)


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Advances in large-scale genotyping of genetic polymorphisms are driving a revolution in genetic epidemiology and human population genetics. The information provided by these techniques has revealed population structure that can increase error in genome-wide association studies (GWAS). Recent studies have demonstrated such structure at both the interregional and intraregional level in Europe. This project assessed the degree of genetic structure in populations of the Iberian Peninsula and other regions of southwestern Europe (Italy and France) in order to quantify the impact that this potential structure may have on the design of GWAS and to reconstruct the demographic history of Mediterranean populations. To achieve these objectives, DNA samples from 770 individuals from 26 populations of the Iberian Peninsula, France, Italy and other Mediterranean countries were analysed. These samples were genotyped for 240,000 SNPs using the Affymetrix 250K StyI array within this project, or using other Affymetrix arrays in the international HapMap and POPRES projects. Statistical analyses were carried out, including principal component analysis, Fst, identity by descent, linkage disequilibrium, genetic barriers, etc. These results made it possible to build a reference framework for variability in this region, to assess its impact on association studies and to propose measures to avoid any increase in error of either type (type I and II) in national and international studies. In addition, they made it possible to reconstruct the history of the human populations of the Mediterranean and to analyse their demographic relationships.
Given the limited duration of this action (24 months, from October 2010 to September 2012), the results of this project are currently being written up and will lead to several publications in international journals and to the preparation of conference communications.
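Of the statistics listed in this abstract, Fst has a compact textbook definition that is easy to show on paper. The sketch below assumes the simple heterozygosity-based form for a single biallelic SNP, Fst = (H_T - H_S) / H_T, with equal weight per population. The function name and the allele frequencies are invented for illustration, and the project may well have used a different estimator.

```python
def fst_biallelic(freqs):
    """Fst for one biallelic SNP from per-population allele frequencies.

    Uses Fst = (H_T - H_S) / H_T and weights every population equally.
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)                   # heterozygosity of the pooled population
    h_s = sum(2 * p * (1 - p) for p in freqs) / n   # mean within-population heterozygosity
    if h_t == 0:
        return 0.0                                  # SNP is fixed everywhere: no structure to measure
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies (not taken from the study)
similar = fst_biallelic([0.50, 0.52, 0.48])
divergent = fst_biallelic([0.10, 0.50, 0.90])
```

Populations with nearly identical allele frequencies give a value close to 0, while strongly diverging frequencies push the value towards 1.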

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Biplots are graphical displays of data matrices based on the decomposition of a matrix as the product of two matrices. Elements of these two matrices are used as coordinates for the rows and columns of the data matrix, with an interpretation of the joint presentation that relies on the properties of the scalar product. Because the decomposition is not unique, there are several alternative ways to scale the row and column points of the biplot, which can cause confusion amongst users, especially when software packages are not united in their approach to this issue. We propose a new scaling of the solution, called the standard biplot, which applies equally well to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. The standard biplot also handles data matrices with widely different levels of inherent variance. Two concepts taken from correspondence analysis are important to this idea: the weighting of row and column points, and the contributions made by the points to the solution. In the standard biplot one set of points, usually the rows of the data matrix, optimally represent the positions of the cases or sample units, which are weighted and usually standardized in some way unless the matrix contains values that are comparable in their raw form. The other set of points, usually the columns, is represented in accordance with their contributions to the low-dimensional solution. As for any biplot, the projections of the row points onto vectors defined by the column points approximate the centred and (optionally) standardized data. 
The method is illustrated with several examples to demonstrate how the standard biplot copes in different situations to give a joint map which needs only one common scale on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot readable. The proposal also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important.
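The non-uniqueness of the decomposition described in this abstract can be checked numerically. The following is a minimal sketch, not the paper's standard biplot: it assumes NumPy, a hypothetical helper `svd_biplot`, and random toy data. It allocates the singular values to the rows or to the columns and shows that both choices give different coordinates but the same scalar products.

```python
import numpy as np


def svd_biplot(X, alpha=1.0, ndim=2):
    """Row and column coordinates from the SVD of the column-centred matrix.

    alpha splits the singular values: rows are scaled by s**alpha,
    columns by s**(1 - alpha).
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    rows = U[:, :ndim] * s[:ndim] ** alpha
    cols = Vt[:ndim].T * s[:ndim] ** (1 - alpha)
    return Xc, rows, cols


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))  # toy data, invented for the example

# Keep all three dimensions so the reconstruction is exact
Xc, rows_a, cols_a = svd_biplot(X, alpha=1.0, ndim=3)  # singular values on the rows
_, rows_b, cols_b = svd_biplot(X, alpha=0.0, ndim=3)   # singular values on the columns

# Different coordinates, identical scalar products
same_product = np.allclose(rows_a @ cols_a.T, rows_b @ cols_b.T)
```

With `ndim=2` the scalar products only approximate the centred data, which is the usual situation in a two-dimensional display.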

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In order to interpret the biplot it is necessary to know which points, usually variables, are the important contributors to the solution, and this information is available separately as part of the biplot's numerical results. We propose a new scaling of the display, called the contribution biplot, which incorporates this diagnostic directly into the graphical display, showing visually the important contributors and thus facilitating the biplot interpretation and often simplifying the graphical representation considerably. The contribution biplot can be applied to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. In the contribution biplot one set of points, usually the rows of the data matrix, optimally represents the spatial positions of the cases or sample units, according to some distance measure that usually incorporates some form of standardization unless all data are comparable in scale. The other set of points, usually the columns, is represented by vectors that are related to their contributions to the low-dimensional solution. A fringe benefit is that usually only one common scale for row and column points is needed on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot legible. Furthermore, this version of the biplot also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important, when they are in fact contributing minimally to the solution.
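For the simplest case, an unweighted principal component analysis of centred data, the idea can be sketched as follows. This assumes NumPy and a hypothetical helper `contribution_biplot`, and uses invented data. Rows are placed in principal coordinates, and columns are placed at the right singular vectors, whose squared entries sum to 1 on each axis and can therefore be read as each variable's share of that axis. The weighted cases treated in the paper (for example correspondence analysis) need additional scaling by the masses, which is omitted here.

```python
import numpy as np


def contribution_biplot(X, ndim=2):
    """Coordinates for a contribution-style biplot of an unweighted PCA."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    row_principal = U[:, :ndim] * s[:ndim]  # cases: principal coordinates
    col_contrib = Vt[:ndim].T               # variables: right singular vectors
    contributions = col_contrib ** 2        # share of each axis due to each variable
    return row_principal, col_contrib, contributions


rng = np.random.default_rng(1)
# Toy data: the first variable has a much larger spread than the other two
X = rng.normal(size=(50, 3)) * np.array([10.0, 1.0, 1.0])

rows, cols, contrib = contribution_biplot(X)
axis_totals = contrib.sum(axis=0)           # one total per axis, each equal to 1
main_variable = int(contrib[:, 0].argmax()) # variable that dominates the first axis
```

Because the column vectors have length at most 1, variables that contribute little stay close to the origin instead of crowding the periphery of the map.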

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
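Fuzzy coding and its inverse can be illustrated for a single measurement. The sketch below assumes triangular membership functions with three hinge points; the function names and the hinge values are hypothetical. It only shows the coding and the exact back-transformation of one value, not the fuzzy multiple correspondence analysis itself, where defuzzification is applied to estimated fuzzy values.

```python
def fuzzy_code(x, lo, mid, hi):
    """Code x into three fuzzy categories using triangular memberships.

    lo, mid and hi are the hinge points (for example minimum, median, maximum).
    The three memberships are nonnegative and sum to 1.
    """
    if not lo <= x <= hi:
        raise ValueError("x must lie between lo and hi")
    if x <= mid:
        upper = (x - lo) / (mid - lo)
        return [1.0 - upper, upper, 0.0]
    upper = (x - mid) / (hi - mid)
    return [0.0, 1.0 - upper, upper]


def defuzzify(memberships, hinges):
    """Recover a value as the membership-weighted average of the hinge points."""
    return sum(m * h for m, h in zip(memberships, hinges))


hinges = (0.0, 10.0, 30.0)            # hypothetical hinge points
coded = fuzzy_code(15.0, *hinges)     # [0.0, 0.75, 0.25]
restored = defuzzify(coded, hinges)   # 15.0
```

A crisp coding would place the value 15 entirely in one category, whereas the fuzzy coding keeps enough information to recover it.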

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The singular value decomposition and its interpretation as a linear biplot has proved to be a powerful tool for analysing many forms of multivariate data. Here we adapt biplot methodology to the specific case of compositional data consisting of positive vectors each of which is constrained to have unit sum. These relative variation biplots have properties relating to special features of compositional data: the study of ratios, subcompositions and models of compositional relationships. The methodology is demonstrated on a data set consisting of six-part colour compositions in 22 abstract paintings, showing how the singular value decomposition can achieve an accurate biplot of the colour ratios and how possible models interrelating the colours can be diagnosed.
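A common first step towards a biplot of ratios is the centred log-ratio (clr) transformation, after which differences between parts are log-ratios. The sketch below assumes this transformation; the function name and the six-part composition are invented and are not taken from the paintings data. The subsequent centring and singular value decomposition of the transformed matrix are not shown.

```python
import math


def clr(composition):
    """Centred log-ratio transform of one composition with positive parts."""
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical six-part composition with unit sum
colours = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]
z = clr(colours)

# The difference between two transformed parts is the log of their ratio
log_ratio_01 = z[0] - z[1]

# Rescaling the composition (for example to percentages) changes nothing
z_percent = clr([100 * part for part in colours])
```

The transformed values sum to zero, and only the ratios between parts matter, which is why the unit-sum constraint no longer distorts the analysis.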

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, the method leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.
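The link between variable weights and standardization mentioned in this abstract can be verified on a small example. The sketch below is an illustration with a hypothetical function `weighted_euclidean` and invented data. It fixes the weights at the inverse variances instead of estimating them, so the majorization step of the paper is not reproduced.

```python
import math
from statistics import mean, pstdev


def weighted_euclidean(x, y, weights):
    """Distance sqrt(sum_k w_k * (x_k - y_k)**2) with nonnegative weights."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


# Toy cases-by-variables matrix: three cases, two variables on very different scales
data = [[1.0, 100.0], [2.0, 300.0], [4.0, 200.0]]
columns = list(zip(*data))

# Weights equal to the inverse variances, as implied by PCA of standardized data
weights = [1.0 / pstdev(col) ** 2 for col in columns]

# The same distance computed the long way: standardize, then use unit weights
standardized = [
    [(value - mean(col)) / pstdev(col) for value, col in zip(row, columns)]
    for row in data
]

d_weighted = weighted_euclidean(data[0], data[1], weights)
d_standard = weighted_euclidean(standardized[0], standardized[1], [1.0, 1.0])
```

Both routes give the same distance, which is the sense in which standardization acts as a differential weighting of the variables.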

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The correlation between the species composition of pasture communities and soil properties in Plana de Vic has been studied using two multivariate methods, Correspondence Analysis (CA) for the vegetation data and Principal Component Analysis (PCA) for the soil data. To analyse the pastures, we took 144 vegetation relevés (comprising 201 species) that have been classified into 10 phytocoenological communities elsewhere. Most of these communities are almost entirely built up by perennials, ranging from xerophilous, clearly Mediterranean, to mesophilous, related to medium-European pastures, but a few occurring in shallow soils are dominated by therophytes. As for the soil properties, we analysed texture, pH, depth, bulk density, organic matter, C/N ratio and the carbonate content of 25 samples, corresponding to representative relevés of the communities studied.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A five-year program of systematic multi-element geochemical exploration of the Catalonian Coastal Ranges has been initiated by the Geological Survey of the Autonomous Government of Catalonia (Generalitat de Catalunya) and the Department of Geological and Geophysical Exploration (University of Barcelona). This paper reports the first-stage results of this regional survey, covering an area of 530 km² in the Montseny Mountains, NE of Barcelona (Spain). Stream sediments for metals and stream waters for fluoride were chosen because of the regional characteristics. Four target areas for a future tactical survey were recognized after prospecting. The most important is a 40 km² zone in the Canoves-Vilamajor area, with high base metal values accompanied by Cd, Ni, Co, As and Sb anomalies. Keywords: Catalanides. Geochemical exploration. Stream sediments. Base metal anomalies. Principal Component Analysis.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this work we present a simulation of a recognition process that uses the perimeter characterization of simple plant leaves as the only discriminating parameter. Data coding that makes the description independent of leaf size and orientation may penalize recognition performance for some varieties. Border description sequences are then used, and Principal Component Analysis (PCA) is applied in order to study the best number of components for the classification task, which is implemented by means of a Support Vector Machine (SVM) system. The results obtained are satisfactory and, compared with [4], our system improves the recognition success rate while also reducing the variance.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The performance of high-resolution accurate mass spectrometry (HRMS) operating in full scan MS mode was investigated for the quantitative determination of amoxicillin (AMX) as well as the qualitative analysis of metabolomic profiles in tissues of medicated chickens. The metabolomic approach was exploited to compile analytical information on changes in the metabolome of muscle, kidney and liver from chickens subjected to a pharmacological program with AMX. Data consisting of m/z features taken throughout the entire chromatogram were extracted and filtered to be treated by Principal Component Analysis. As a result, it was found that medicated and non-treated animals were clearly clustered in distinct groups. In addition, the multivariate analysis revealed some relevant mass features contributing to this separation. In this context, recognizing those potential markers of each chicken class was a research priority for both metabolite identification and the evaluation of food quality and health effects associated with food consumption.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: Differences in the distribution of genotypes between individuals of the same ethnicity are an important confounding factor commonly undervalued in typical association studies conducted in radiogenomics. Objective: To evaluate the genotypic distribution of SNPs in a wide set of Spanish prostate cancer patients in order to determine the homogeneity of the population and to disclose potential bias. Design, Setting, and Participants: A total of 601 prostate cancer patients from Andalusia, the Basque Country, the Canary Islands and Catalonia were genotyped for 10 SNPs located in 6 different genes associated with DNA repair: XRCC1 (rs25487, rs25489, rs1799782), ERCC2 (rs13181), ERCC1 (rs11615), LIG4 (rs1805388, rs1805386), ATM (rs17503908, rs1800057) and P53 (rs1042522). The SNP genotyping was performed in a Biotrove OpenArray NT Cycler. Outcome Measurements and Statistical Analysis: Comparisons of genotypic and allelic frequencies among populations, as well as haplotype analyses, were carried out using the web-based environment SNPator. Principal component analysis was performed using the SnpMatrix and XSnpMatrix classes and methods implemented as an R package. Non-supervised hierarchical clustering of SNPs was performed using MultiExperiment Viewer. Results and Limitations: We observed that the genotype distribution of 4 out of 10 SNPs was statistically different among the studied populations, with the greatest differences between Andalusia and Catalonia. These observations were confirmed in cluster analysis, principal component analysis and in the differential distribution of haplotypes among the populations. Because tumor characteristics have not been taken into account, it is possible that some polymorphisms may influence tumor characteristics in the same way that they may pose a risk factor for other disease characteristics.
Conclusion: Differences in the distribution of genotypes within different populations of the same ethnicity could be an important confounding factor responsible for the lack of validation of SNPs associated with radiation-induced toxicity, especially when extensive meta-analyses with subjects from different countries are carried out.
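One common way to compare allelic frequencies between populations is a chi-square test on a table of allele counts. The abstract does not say which test SNPator applied, so the sketch below is only an assumption about the general approach. The function name and the counts are invented.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a count table.

    Rows are populations, columns are alleles (or genotypes).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if allele frequencies were the same in every population
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Invented allele counts for one SNP in two populations
counts = [[120, 80], [90, 110]]
stat, dof = chi_square_statistic(counts)
```

For these made-up counts the statistic is about 9.02 with one degree of freedom, above the 5% critical value of 3.84, so the two hypothetical populations would be judged to differ at this SNP.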

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Web 2.0 services such as social bookmarking allow users to manage and share the links they find interesting, adding their own tags for describing them. This is especially interesting in the field of open educational resources, as delicious is a simple way to bridge the institutional point of view (i.e. learning object repositories) with the individual one (i.e. personal collections), thus promoting the discovering and sharing of such resources by other users. In this paper we propose a methodology for analyzing such tags in order to discover hidden semantics (i.e. taxonomies and vocabularies) that can be used to improve descriptions of learning objects and make learning object repositories more visible and discoverable. We propose the use of a simple statistical analysis tool such as principal component analysis to discover which tags create clusters that can be semantically interpreted. We will compare the obtained results with a collection of resources related to open educational resources, in order to better understand the real needs of people searching for open educational resources.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We performed a spatiotemporal analysis of a network of 21 Scots pine (Pinus sylvestris) ring-width chronologies in northern Fennoscandia by means of chronology statistics and multivariate analyses. Chronologies are located on both sides (western and eastern) of the Scandes Mountains (67°N-70°N, 15°E-29°E). Growth relationships with temperature, precipitation, and North Atlantic Oscillation (NAO) indices were calculated for the period 1880-1991. We also assessed their temporal stability. Current July temperature and, to a lesser degree, May precipitation are the main growth limiting factors in the whole area of study. However, Principal Component Analysis (PCA) and mean interseries correlation revealed differences in radial growth between both sides of the Scandes Mountains, attributed to the Oceanic-Continental climatic gradient in the area. The gradient signal is temporally variable and has strengthened during the second half of the 20th century. Northern Fennoscandia Scots pine growth is positively related to early winter NAO indices previous to the growth season and to late spring NAO. NAO/growth relationships are unstable and have dropped in the second half of the 20th century. Moreover, they are noncontinuous through the range of NAO values: for early winter, only positive NAO indices enhance tree growth in the next growing season, while negative NAO does not. For spring, only negative NAO is correlated with radial growth.
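The mean interseries correlation used in this abstract is the average Pearson correlation over all pairs of chronologies. A minimal sketch follows, with hypothetical function names and invented ring-width indices; it ignores refinements such as restricting the calculation to a common period or to moving windows.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_interseries_correlation(series):
    """Average correlation over all distinct pairs of series."""
    pairs = list(combinations(series, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Invented ring-width indices for three chronologies over five years
west_1 = [0.9, 1.1, 1.0, 1.3, 0.8]
west_2 = [1.0, 1.2, 1.0, 1.4, 0.7]
east_1 = [1.1, 0.9, 1.2, 0.8, 1.0]

within_west = mean_interseries_correlation([west_1, west_2])
all_sites = mean_interseries_correlation([west_1, west_2, east_1])
```

A higher value within one group of sites than across all sites is the kind of pattern that points to differing growth signals between regions.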

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objective: To compare lower incisor dentoalveolar compensation and mandible symphysis morphology among Class I and Class III malocclusion patients with different facial vertical skeletal patterns. Materials and Methods: Lower incisor extrusion and inclination, as well as buccal (LA) and lingual (LP) cortex depth, and mandibular symphysis height (LH) were measured in 107 lateral cephalometric x-rays of adult patients without prior orthodontic treatment. In addition, malocclusion type (Class I or III) and facial vertical skeletal pattern were considered. Through a principal component analysis (PCA) related variables were reduced. Simple regression equation and multivariate analyses of variance were also used. Results: Incisor mandibular plane angle (P < .001) and extrusion (P = .03) values showed significant differences between the sagittal malocclusion groups. Variations in the mandibular plane have a negative correlation with LA (Class I P = .03 and Class III P = .01) and a positive correlation with LH (Class I P = .01 and Class III P = .02) in both groups. Within the Class III group, there was a negative correlation between the mandibular plane and LP (P = .02). PCA showed that the tendency toward a long face causes the symphysis to elongate and narrow. In Class III, alveolar narrowing is also found in normal faces. Conclusions: Vertical facial pattern is a significant factor in mandibular symphysis alveolar morphology and lower incisor positioning, both for Class I and Class III patients. Short-faced Class III patients have a widened alveolar bone. However, for long-faced and normal-faced Class III, natural compensation elongates the symphysis and influences lower incisor position.