951 resultados para principal component analysis (PCA)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rho GTPases are conformational switches that control a wide variety of signaling pathways critical for eukaryotic cell development and proliferation. They represent attractive targets for drug design as their aberrant function and deregulated activity is associated with many human diseases including cancer. Extensive high-resolution structures (.100) and recent mutagenesis studies have laid the foundation for the design of new structure-based chemotherapeutic strategies. Although the inhibition of Rho signaling with drug-like compounds is an active area of current research, very little attention has been devoted to directly inhibiting Rho by targeting potential allosteric non-nucleotide binding sites. By avoiding the nucleotide binding site, compounds may minimize the potential for undesirable off-target interactions with other ubiquitous GTP and ATP binding proteins. Here we describe the application of molecular dynamics simulations, principal component analysis, sequence conservation analysis, and ensemble small-molecule fragment mapping to provide an extensive mapping of potential small-molecule binding pockets on Rho family members. Characterized sites include novel pockets in the vicinity of the conformationaly responsive switch regions as well as distal sites that appear to be related to the conformations of the nucleotide binding region. Furthermore the use of accelerated molecular dynamics simulation, an advanced sampling method that extends the accessible time-scale of conventional simulations, is found to enhance the characterization of novel binding sites when conformational changes are important for the protein mechanism.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Els avenços en tècniques de genotipat de polimorfismes genètics a gran escala estan liderant una revolució en el camp de l’epidemiologia genètica i la genètica de poblacions humanes. La informació aportada per aquestes tècniques ha evidenciat l’existència d’estructuracions poblacionals que poden augmentar l’error en els estudis d’associació a escala genòmica (GWAS, genome-wide association studies). Estudis recents han demostrat la presència d’aquestes estructuracions a nivell interregional i intrarregional a Europa. El present projecte ha avaluat el grau d’estructuració genètica en poblacions de la Península Ibèrica i altres regions del sudoest europeu (Itàlia i França) per quantificar l’impacte que aquesta potencial estructuració pot tenir en el disseny d’estudis d’associació GWAS i reconstruir la història demogràfica de les poblacions de la Mediterrània. Per aconseguir aquests objectius, s’han analitzat mostres de DNA de 770 individus de 26 poblacions de la Península Ibèrica, França, Itàlia i d’altres països de la Mediterrània. Aquestes mostres van ser genotipades per 240000 SNPs utilitzant l’array 250K StyI d’Affymetrix en el marc d’aquest projecte o mitjançant altres arrays d’Affymetrix en els projectes internacionals HapMap i POPRES. S’han realitzat anàlisis estadístiques incloent anàlisis de components principals, Fst, identitat per descendència, desequilibri de lligament, barreres genètiques, etc. Aquests resultats han permés construir un marc de referència de la variabilitat en aquesta regió, avaluar el seu impacte en estudis d’associació i proposar mesures per evitar l’increment de qualsevol tipus d’error (tipus I i II) en estudis nacionals i internacionals. A més, també han permés reconstruir la història de les poblacions humanes de la Mediterrània així com analitzar les seves relacions demogràfiques. Donada la duració limitada d’aquesta acció (24 mesos, d’octubre de 2010 a setembre de 2012), els resultats d’aquest projecte es troben actualment en fase de redacció i conduiran a diverses publicacions en revistes internacionals i a la preparació de comunicacions a congressos.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The fatty acids from cocoa butters of different origins, varieties, and suppliers and a number of cocoa butter equivalents (Illexao 30-61, Illexao 30-71, Illexao 30-96, Choclin, Coberine, Chocosine-Illipe, Chocosine-Shea, Shokao, Akomax, Akonord, and Ertina) were investigated by bulk stable carbon isotope analysis and compound specific isotope analysis. The interpretation is based on principal component analysis combining the fatty acid concentrations and the bulk and molecular isotopic data. The scatterplot of the two first principal components allowed detection of the addition of vegetable fats to cocoa butters. Enrichment in heavy carbon isotope (C-13) of the bulk cocoa butter and of the individual fatty acids is related to mixing with other vegetable fats and possibly to thermally or oxidatively induced degradation during processing (e.g., drying and roasting of the cocoa beans or deodorization of the pressed fat) or storage. The feasibility of the analytical approach for authenticity assessment is discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biplots are graphical displays of data matrices based on the decomposition of a matrix as the product of two matrices. Elements of these two matrices are used as coordinates for the rows and columns of the data matrix, with an interpretation of the joint presentation that relies on the properties of the scalar product. Because the decomposition is not unique, there are several alternative ways to scale the row and column points of the biplot, which can cause confusion amongst users, especially when software packages are not united in their approach to this issue. We propose a new scaling of the solution, called the standard biplot, which applies equally well to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. The standard biplot also handles data matrices with widely different levels of inherent variance. Two concepts taken from correspondence analysis are important to this idea: the weighting of row and column points, and the contributions made by the points to the solution. In the standard biplot one set of points, usually the rows of the data matrix, optimally represent the positions of the cases or sample units, which are weighted and usually standardized in some way unless the matrix contains values that are comparable in their raw form. The other set of points, usually the columns, is represented in accordance with their contributions to the low-dimensional solution. As for any biplot, the projections of the row points onto vectors defined by the column points approximate the centred and (optionally) standardized data. The method is illustrated with several examples to demonstrate how the standard biplot copes in different situations to give a joint map which needs only one common scale on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot readable. The proposal also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cape Verde is a tropical oceanic ecosystem, highly fragmented and dispersed, with islands physically isolated by distance and depth. To understand how isolation affects the ecological variability in this archipelago, we conducted a research project on the community structure of the 18 commercially most important demersal fishes. An index of ecological distance based on species relative dominance (Di) is developed from Catch Per Unit Effort, derived from an extensive database of artisanal fisheries. Two ecological measures of distance between islands are calculated: at the species level, DDi, and at the community level, DD (sum of DDi). A physical isolation factor (Idb) combining distance (d) and bathymetry (b) is proposed. Covariance analysis shows that isolation factor is positively correlated with both DDi and DD, suggesting that Idb can be considered as an ecological isolation factor. The effect of Idb varies with season and species. This effect is stronger in summer (May to November), than in winter (December to April), which appears to be more unstable. Species react differently to Idb, independently of season. A principal component analysis on the monthly (DDi) for the 12 islands and the 18 species, complemented by an agglomerative hierarchical clustering, shows a geographic pattern of island organization, according to Idb. Results indicate that the ecological structure of demersal fish communities of Cape Verde archipelago, both in time and space, can be explained by a geographic isolation factor. The analytical approach used here is promising and could be tested in other archipelago systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Cape Verde Archipelago location and its biogeographical features are of special interest for Marine Ecology. However, there’s a lack of knowledge regarding the composition of the coastal ecosystems in this region, especially about benthic macroinvertebrates subtidal communities. Between August and October of 2007, eight locations around the island of São Vicente were sampled. Within each of those spots, fragments of substratum were collected and throughout the processing of the collected data, a total of 4032 individuals were counted, which belong to 81 different species. Shannon’s Entropy and Gini-Simpson’s diversity index were calculated, as the real number of species each one represented. By comparing the results, differences between sampling stations and between indices within the same sampling station were found. With the purpose of clustering the sampled locations according to the number of collected organisms by species, a dendrogram was elaborated and a principal component analysis was carried out. The considered sampling stations didn’t reveal significant differences according to the composition of their benthic macroinvertebrates subtidal communities in terms of great taxonomic groups or functional groups. It’s assumed that they differ only by minute traits.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Understanding the genetic structure of human populations is of fundamental interest to medical, forensic and anthropological sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation and suggest the potential to use large samples to uncover variation among closely spaced populations. Here we characterize genetic variation in a sample of 3,000 European individuals genotyped at over half a million variable DNA sites in the human genome. Despite low average levels of genetic differentiation among Europeans, we find a close correspondence between genetic and geographic distances; indeed, a geographical map of Europe arises naturally as an efficient two-dimensional summary of genetic variation in Europeans. The results emphasize that when mapping the genetic basis of a disease phenotype, spurious associations can arise if genetic structure is not properly accounted for. In addition, the results are relevant to the prospects of genetic ancestry testing; an individual's DNA can be used to infer their geographic origin with surprising accuracy-often to within a few hundred kilometres.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Using one male-inherited and eight biparentally inherited microsatellite markers, we investigate the population genetic structure of the Valais chromosome race of the common shrew (Sorex araneus) in the Central Alps of Europe. Unexpectedly, the Y-chromosome microsatellite suggests nearly complete absence of male gene flow among populations from the St-Bernard and Simplon regions (Switzerland). Autosomal markers also show significant genetic structuring among these two geographical areas. Isolation by distance is significant and possible barriers to gene flow exist in the study area. Two different approaches are used to better understand the geographical patterns and the causes of this structuring. Using a principal component analysis for which testing procedure exists, and partial Mantel tests, we show that the St-Bernard pass does not represent a significant barrier to gene flow although it culminates at 2469 m, close to the highest altitudinal record for this species. Similar results are found for the Simplon pass, indicating that both passes represented potential postglacial recolonization routes into Switzerland from Italian refugia after the last Pleistocene glaciations. In contrast with the weak effect of these mountain passes, the Rhône valley lowlands significantly reduce gene flow in this species. Natural obstacles (the large Rhône river) and unsuitable habitats (dry slopes) are both present in the valley. Moreover, anthropogenic changes to landscape structures are likely to have strongly reduced available habitats for this shrew in the lowlands, thereby promoting genetic differentiation of populations found on opposite sides of the Rhône valley.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In order to interpret the biplot it is necessary to know which points usually variables are the ones that are important contributors to the solution, and this information is available separately as part of the biplot s numerical results. We propose a new scaling of the display, called the contribution biplot, which incorporates this diagnostic directly into the graphical display, showing visually the important contributors and thus facilitating the biplot interpretation and often simplifying the graphical representation considerably. The contribution biplot can be applied to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA, in fact to any method based on the singular-value decomposition. In the contribution biplot one set of points, usually the rows of the data matrix, optimally represent the spatial positions of the cases or sample units, according to some distance measure that usually incorporates some form of standardization unless all data are comparable in scale. The other set of points, usually the columns, is represented by vectors that are related to their contributions to the low-dimensional solution. A fringe benefit is that usually only one common scale for row and column points is needed on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot legible. Furthermore, this version of the biplot also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important, when they are in fact contributing minimally to the solution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The singular value decomposition and its interpretation as alinear biplot has proved to be a powerful tool for analysing many formsof multivariate data. Here we adapt biplot methodology to the specifficcase of compositional data consisting of positive vectors each of whichis constrained to have unit sum. These relative variation biplots haveproperties relating to special features of compositional data: the studyof ratios, subcompositions and models of compositional relationships. Themethodology is demonstrated on a data set consisting of six-part colourcompositions in 22 abstract paintings, showing how the singular valuedecomposition can achieve an accurate biplot of the colour ratios and howpossible models interrelating the colours can be diagnosed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, the method leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Changes in gene expression are thought to underlie many of the phenotypic differences between species. However, large-scale analyses of gene expression evolution were until recently prevented by technological limitations. Here we report the sequencing of polyadenylated RNA from six organs across ten species that represent all major mammalian lineages (placentals, marsupials and monotremes) and birds (the evolutionary outgroup), with the goal of understanding the dynamics of mammalian transcriptome evolution. We show that the rate of gene expression evolution varies among organs, lineages and chromosomes, owing to differences in selective pressures: transcriptome change was slow in nervous tissues and rapid in testes, slower in rodents than in apes and monotremes, and rapid for the X chromosome right after its formation. Although gene expression evolution in mammals was strongly shaped by purifying selection, we identify numerous potentially selectively driven expression switches, which occurred at different rates across lineages and tissues and which probably contributed to the specific organ biology of various mammals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Functionally relevant large scale brain dynamics operates within the framework imposed by anatomical connectivity and time delays due to finite transmission speeds. To gain insight on the reliability and comparability of large scale brain network simulations, we investigate the effects of variations in the anatomical connectivity. Two different sets of detailed global connectivity structures are explored, the first extracted from the CoCoMac database and rescaled to the spatial extent of the human brain, the second derived from white-matter tractography applied to diffusion spectrum imaging (DSI) for a human subject. We use the combination of graph theoretical measures of the connection matrices and numerical simulations to explicate the importance of both connectivity strength and delays in shaping dynamic behaviour. Our results demonstrate that the brain dynamics derived from the CoCoMac database are more complex and biologically more realistic than the one based on the DSI database. We propose that the reason for this difference is the absence of directed weights in the DSI connectivity matrix.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Our current knowledge of the general factor requirement in transcription by the three mammalian RNA polymerases is based on a small number of model promoters. Here, we present a comprehensive chromatin immunoprecipitation (ChIP)-on-chip analysis for 28 transcription factors on a large set of known and novel TATA-binding protein (TBP)-binding sites experimentally identified via ChIP cloning. A large fraction of identified TBP-binding sites is located in introns or lacks a gene/mRNA annotation and is found to direct transcription. Integrated analysis of the ChIP-on-chip data and functional studies revealed that TAF12 hitherto regarded as RNA polymerase II (RNAP II)-specific was found to be also involved in RNAP I transcription. Distinct profiles for general transcription factors and TAF-containing complexes were uncovered for RNAP II promoters located in CpG and non-CpG islands suggesting distinct transcription initiation pathways. Our study broadens the spectrum of general transcription factor function and uncovers a plethora of novel, functional TBP-binding sites in the human genome.