46 results for principal component analysis (PCA)

Relevance: 100.00%

Abstract:

Rho GTPases are conformational switches that control a wide variety of signaling pathways critical for eukaryotic cell development and proliferation. They represent attractive targets for drug design because their aberrant function and deregulated activity are associated with many human diseases, including cancer. Extensive high-resolution structures (>100) and recent mutagenesis studies have laid the foundation for the design of new structure-based chemotherapeutic strategies. Although the inhibition of Rho signaling with drug-like compounds is an active area of current research, very little attention has been devoted to directly inhibiting Rho by targeting potential allosteric non-nucleotide binding sites. By avoiding the nucleotide binding site, compounds may minimize the potential for undesirable off-target interactions with other ubiquitous GTP and ATP binding proteins. Here we describe the application of molecular dynamics simulations, principal component analysis, sequence conservation analysis, and ensemble small-molecule fragment mapping to provide an extensive mapping of potential small-molecule binding pockets on Rho family members. Characterized sites include novel pockets in the vicinity of the conformationally responsive switch regions as well as distal sites that appear to be related to the conformations of the nucleotide binding region. Furthermore, the use of accelerated molecular dynamics simulation, an advanced sampling method that extends the accessible time-scale of conventional simulations, is found to enhance the characterization of novel binding sites when conformational changes are important for the protein mechanism.

Relevance: 100.00%

Abstract:

Advances in large-scale genotyping of genetic polymorphisms are leading a revolution in genetic epidemiology and human population genetics. The information provided by these techniques has revealed population structure that can increase the error in genome-wide association studies (GWAS). Recent studies have demonstrated the presence of such structure at both the inter-regional and intra-regional level in Europe. This project evaluated the degree of genetic structure in populations of the Iberian Peninsula and other regions of south-western Europe (Italy and France) in order to quantify the impact that this potential structure may have on the design of GWAS and to reconstruct the demographic history of Mediterranean populations. To achieve these objectives, DNA samples from 770 individuals from 26 populations of the Iberian Peninsula, France, Italy and other Mediterranean countries were analysed. These samples were genotyped for 240,000 SNPs using the Affymetrix 250K StyI array within this project, or using other Affymetrix arrays in the international HapMap and POPRES projects. Statistical analyses were performed, including principal component analysis, Fst, identity by descent, linkage disequilibrium, genetic barriers, etc. These results have made it possible to build a reference framework of the variability in this region, to assess its impact on association studies, and to propose measures to avoid increasing any type of error (type I and II) in national and international studies. In addition, they have made it possible to reconstruct the history of the human populations of the Mediterranean and to analyse their demographic relationships.
Given the limited duration of this action (24 months, from October 2010 to September 2012), the results of this project are currently being written up and will lead to several publications in international journals and to the preparation of conference communications.

Relevance: 100.00%

Abstract:

Biplots are graphical displays of data matrices based on the decomposition of a matrix as the product of two matrices. Elements of these two matrices are used as coordinates for the rows and columns of the data matrix, with an interpretation of the joint presentation that relies on the properties of the scalar product. Because the decomposition is not unique, there are several alternative ways to scale the row and column points of the biplot, which can cause confusion amongst users, especially when software packages are not united in their approach to this issue. We propose a new scaling of the solution, called the standard biplot, which applies equally well to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. The standard biplot also handles data matrices with widely different levels of inherent variance. Two concepts taken from correspondence analysis are important to this idea: the weighting of row and column points, and the contributions made by the points to the solution. In the standard biplot one set of points, usually the rows of the data matrix, optimally represent the positions of the cases or sample units, which are weighted and usually standardized in some way unless the matrix contains values that are comparable in their raw form. The other set of points, usually the columns, is represented in accordance with their contributions to the low-dimensional solution. As for any biplot, the projections of the row points onto vectors defined by the column points approximate the centred and (optionally) standardized data. 
The method is illustrated with several examples to demonstrate how the standard biplot copes in different situations to give a joint map which needs only one common scale on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot readable. The proposal also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important.

Relevance: 100.00%

Abstract:

In order to interpret the biplot it is necessary to know which points (usually variables) are the important contributors to the solution, and this information is available separately as part of the biplot's numerical results. We propose a new scaling of the display, called the contribution biplot, which incorporates this diagnostic directly into the graphical display, showing visually the important contributors and thus facilitating the biplot interpretation and often simplifying the graphical representation considerably. The contribution biplot can be applied to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. In the contribution biplot one set of points, usually the rows of the data matrix, optimally represents the spatial positions of the cases or sample units, according to some distance measure that usually incorporates some form of standardization unless all data are comparable in scale. The other set of points, usually the columns, is represented by vectors that are related to their contributions to the low-dimensional solution. A fringe benefit is that usually only one common scale for row and column points is needed on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot legible. Furthermore, this version of the biplot also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important, when they are in fact contributing minimally to the solution.
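
The idea of reading contributions directly off the column coordinates can be sketched for the simplest case, an unweighted principal component analysis. The sketch assumes equal weights for rows and columns, in which case the column coordinates on an axis are the elements of the unit-length eigenvector and their squares are the contributions to that axis. The data matrix and all function names are hypothetical, and power iteration is used here only as a dependency-free stand-in for a full singular-value decomposition; it is not the abstract's own implementation.

```python
def centre(rows):
    """Subtract the column means from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centred rows."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_axis(matrix, iterations=500):
    """Unit-length leading eigenvector of a symmetric matrix (power iteration)."""
    p = len(matrix)
    v = [1.0 / p ** 0.5] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


# Hypothetical data: 4 cases, 3 variables; the first variable varies most.
data = [[2.0, 1.0, 0.5], [4.0, 1.5, 0.4], [6.0, 0.5, 0.6], [8.0, 1.0, 0.5]]
X = centre(data)
axis = leading_axis(covariance(X))

# Column coordinates on the first axis; their squares are the contributions.
contributions = [a * a for a in axis]

# Row coordinates: projections of the centred cases onto the axis.
scores = [sum(x * a for x, a in zip(row, axis)) for row in X]
```

Because the contributions sum to one, a variable whose squared coordinate exceeds the average 1/p (here 1/3) can be read as an above-average contributor to the axis.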

Relevance: 100.00%

Abstract:

A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
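
Fuzzy coding and defuzzification can be illustrated with a minimal sketch. It assumes triangular membership functions defined by three ordered hinge points, a common choice that the abstract does not specify; the hinge values and function names are hypothetical. With this coding the memberships sum to one, and the hinge-weighted sum of the memberships returns the original value.

```python
def fuzzy_code(x, hinges):
    """Code a value into triangular memberships over strictly increasing hinges."""
    x = min(max(x, hinges[0]), hinges[-1])  # clamp to the hinge range
    memberships = [0.0] * len(hinges)
    for k in range(len(hinges) - 1):
        left, right = hinges[k], hinges[k + 1]
        if left <= x <= right:
            share = (x - left) / (right - left)
            memberships[k] = 1.0 - share
            memberships[k + 1] = share
            break
    return memberships


def defuzzify(memberships, hinges):
    """Recover a value as the membership-weighted sum of the hinge points."""
    return sum(m * h for m, h in zip(memberships, hinges))


# Hypothetical hinges, e.g. the minimum, median and maximum of a variable.
hinges = [0.0, 10.0, 30.0]
coded = fuzzy_code(15.0, hinges)
recovered = defuzzify(coded, hinges)
```

In an analysis of the kind described, the defuzzification step would be applied to memberships estimated from the low-dimensional solution rather than to the exact coded values, so the recovered values would only approximate the data.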

Relevance: 100.00%

Abstract:

The singular value decomposition and its interpretation as a linear biplot has proved to be a powerful tool for analysing many forms of multivariate data. Here we adapt biplot methodology to the specific case of compositional data consisting of positive vectors each of which is constrained to have unit sum. These relative variation biplots have properties relating to special features of compositional data: the study of ratios, subcompositions and models of compositional relationships. The methodology is demonstrated on a data set consisting of six-part colour compositions in 22 abstract paintings, showing how the singular value decomposition can achieve an accurate biplot of the colour ratios and how possible models interrelating the colours can be diagnosed.
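
The preprocessing that usually precedes such a biplot can be sketched as follows. The sketch assumes the common formulation in which the logarithms of the compositions are centred by row (the centred log-ratio transform) and then by column before the singular value decomposition is applied. The three-part compositions and the function names are hypothetical, and the decomposition itself is omitted.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(composition):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def column_centre(rows):
    """Subtract the column means from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


# Hypothetical three-part compositions (for instance, shares of three colours).
raw = [[2.0, 3.0, 5.0], [1.0, 1.0, 2.0], [6.0, 3.0, 1.0]]
compositions = [closure(r) for r in raw]

# Double-centred log matrix: the input to the decomposition.
Z = column_centre([clr(c) for c in compositions])

# A difference of two clr values is the log-ratio of the two parts.
log_ratio = clr(compositions[0])[0] - clr(compositions[0])[1]
```

Since only ratios enter the transform, rescaling a composition (or closing a subcomposition) leaves the log-ratios between its parts unchanged, which is the property the biplot exploits.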

Relevance: 100.00%

Abstract:

We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, the method leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.
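
The weighted Euclidean distance at the centre of this approach can be written down in a short sketch. The abstract estimates the weights by a majorization algorithm, which is not reproduced here. As a stand-in, the sketch uses the fixed inverse-variance weights that correspond to principal component analysis of standardized data, the special case the abstract mentions. The data and function names are hypothetical.

```python
import math


def weighted_euclidean(x, y, weights):
    """Distance sqrt(sum_j w_j * (x_j - y_j)^2) between two cases."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def inverse_variance_weights(rows):
    """Weights 1 / variance_j, equivalent to standardizing each variable."""
    n = len(rows)
    weights = []
    for col in zip(*rows):
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)
        weights.append(1.0 / var)
    return weights


# Hypothetical cases-by-variables matrix with very different column scales.
data = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
w = inverse_variance_weights(data)
d = weighted_euclidean(data[0], data[1], w)
```

With these weights the second variable, despite its larger raw scale, no longer dominates the distance; estimating the weights from a target dissimilarity would replace `inverse_variance_weights` with an optimization step.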

Relevance: 100.00%

Abstract:

A five-year program of systematic multi-element geochemical exploration of the Catalonian Coastal Ranges has been initiated by the Geological Survey of the Autonomous Government of Catalonia (Generalitat de Catalunya) and the Department of Geological and Geophysical Exploration (University of Barcelona). This paper reports the first-stage results of this regional survey, covering an area of 530 km² in the Montseny Mountains, NE of Barcelona (Spain). Stream sediments for metals and stream waters for fluoride were chosen because of the regional characteristics. Four target areas for future tactical surveys were recognized after the prospecting. The most important is a 40 km² zone in the Canoves-Vilamajor area, with high base metal values accompanied by Cd, Ni, Co, As and Sb anomalies. Keywords: Catalanides. Geochemical exploration. Stream sediments. Base metal anomalies. Principal Component Analysis.

Relevance: 100.00%

Abstract:

The performance of high-resolution accurate mass spectrometry (HRMS) operating in full scan MS mode was investigated for the quantitative determination of amoxicillin (AMX) as well as the qualitative analysis of metabolomic profiles in tissues of medicated chickens. The metabolomic approach was exploited to compile analytical information on changes in the metabolome of muscle, kidney and liver from chickens subjected to a pharmacological program with AMX. Data consisting of m/z features taken throughout the entire chromatogram were extracted and filtered before being treated by Principal Component Analysis. As a result, medicated and non-treated animals were found to cluster clearly in distinct groups. In addition, the multivariate analysis revealed some relevant mass features contributing to this separation. In this context, recognizing those potential markers of each chicken class was a research priority both for metabolite identification and for the evaluation of food quality and of the health effects associated with food consumption.

Relevance: 100.00%

Abstract:

Background: Differences in the distribution of genotypes between individuals of the same ethnicity are an important confounding factor commonly undervalued in typical association studies conducted in radiogenomics. Objective: To evaluate the genotypic distribution of SNPs in a wide set of Spanish prostate cancer patients in order to determine the homogeneity of the population and to disclose potential bias. Design, Setting, and Participants: A total of 601 prostate cancer patients from Andalusia, the Basque Country, the Canary Islands and Catalonia were genotyped for 10 SNPs located in 6 different genes associated with DNA repair: XRCC1 (rs25487, rs25489, rs1799782), ERCC2 (rs13181), ERCC1 (rs11615), LIG4 (rs1805388, rs1805386), ATM (rs17503908, rs1800057) and P53 (rs1042522). SNP genotyping was performed in a Biotrove OpenArray® NT Cycler. Outcome Measurements and Statistical Analysis: Comparisons of genotypic and allelic frequencies among populations, as well as haplotype analyses, were performed using the web-based environment SNPator. Principal component analysis was performed using the SnpMatrix and XSnpMatrix classes and methods implemented as an R package. Unsupervised hierarchical clustering of SNPs was performed using MultiExperiment Viewer. Results and Limitations: We observed that the genotype distribution of 4 out of 10 SNPs was statistically different among the studied populations, with the greatest differences between Andalusia and Catalonia. These observations were confirmed by cluster analysis, principal component analysis and the differential distribution of haplotypes among the populations. Because tumor characteristics have not been taken into account, it is possible that some polymorphisms may influence tumor characteristics in the same way that they may pose a risk factor for other disease characteristics.
Conclusion: Differences in the distribution of genotypes within different populations of the same ethnicity could be an important confounding factor responsible for the lack of validation of SNPs associated with radiation-induced toxicity, especially when extensive meta-analyses with subjects from different countries are carried out.

Relevance: 100.00%

Abstract:

Web 2.0 services such as social bookmarking allow users to manage and share the links they find interesting, adding their own tags for describing them. This is especially interesting in the field of open educational resources, as delicious is a simple way to bridge the institutional point of view (i.e. learning object repositories) with the individual one (i.e. personal collections), thus promoting the discovering and sharing of such resources by other users. In this paper we propose a methodology for analyzing such tags in order to discover hidden semantics (i.e. taxonomies and vocabularies) that can be used to improve descriptions of learning objects and make learning object repositories more visible and discoverable. We propose the use of a simple statistical analysis tool such as principal component analysis to discover which tags create clusters that can be semantically interpreted. We will compare the obtained results with a collection of resources related to open educational resources, in order to better understand the real needs of people searching for open educational resources.

Relevance: 100.00%

Abstract:

We present in this paper the results of applying several visual methods to a group of locations, dated between the 6th and 1st centuries BC, in the ager Tarraconensis (Tarragona, Spain), the hinterland of the Roman colony of Tarraco. The difficulty of interpreting the diverse results in a combined way has been resolved by means of statistical methods such as Principal Components Analysis (PCA) and K-means clustering analysis. These methods have allowed us to classify the sites according to the visual structure of the landscape that contains them and the visual relationships that may have existed among them.

Relevance: 100.00%

Abstract:

Near-infrared spectroscopy (NIRS) was used to analyse the crude protein content of dried and milled samples of wheat and to discriminate samples according to their stage of growth. A calibration set of 72 samples from three growth stages of wheat (tillering, heading and harvest) and a validation set of 28 samples were collected for this purpose. Principal components analysis (PCA) of the calibration set discriminated groups of samples according to the growth stage of the wheat. Based on these differences, a classification procedure (SIMCA) classified the validation set samples very accurately: all of them were assigned to the correct group when both the residual and the leverage were used in the classification criteria. Using the residuals alone, all the samples were also correctly classified except one from the tillering stage, which was assigned to both the tillering and heading stages. Finally, the determination of the crude protein content of these samples was considered in two ways: building a global model for all the growth stages, and building local models for each stage separately. The best prediction results for crude protein were obtained using a global model for samples in the first two growth stages (tillering and heading), and using a local model for the harvest stage samples.

Relevance: 100.00%

Abstract:

Macrofossil analysis of a composite 19 m long sediment core from Rano Raraku Lake (Easter Island) was related to litho-sedimentary and geochemical features of the sediment. Strong stratigraphical patterns are shown by indirect gradient analyses of the data. The good correspondence between the stratigraphical patterns derived from macrofossil (Correspondence Analysis) and sedimentary and geochemical data (Principal Component Analysis) shows that macrofossil associations provide sound palaeolimnological information in conjunction with sedimentary data. The main taphonomic factors influencing the macrofossil assemblages are run-off from the catchment, the littoral plant belt, and the depositional environment within the basin. Five main stages during the last 34,000 calibrated years BP (cal yr BP) are characterised from the lithological, geochemical, and macrofossil data. From 34 to 14.6 cal kyr BP (last glacial period) the sediments were largely derived from the catchment, indicating a high energy lake environment with much erosion and run-off bringing abundant plant trichomes, lichens, and mosses into the centre of Raraku Lake.