958 results for PPM-whether molecular biosciences treat compositional data right


Relevance: 100.00%

Abstract:

In an earlier investigation (Burger et al., 2000), five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied by applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) to extract endmembers and evaluate the spatial and temporal variation of geochemical signals. The marine geologists expected three main factors of sedimentation: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason for this may be that only traditional statistical methods were applied, i.e. the untransformed components were used, with the cosine-theta coefficient as similarity measure. During the last decade considerable progress has been made in compositional data analysis, and many case studies have been published using new tools for exploratory analysis of these data. It therefore makes sense to check whether the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of earlier results and to answers to open questions. In this paper we follow the lines of a paper by R. Tolosana-Delgado et al. (2005), starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, applying the ilr transformation to the components and visualizing the factor scores in a spatial context: the compositional factors are plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process. Key words: compositional data analysis, biplot, deep sea sediments

Relevance: 100.00%

Abstract:

The quantitative estimation of Sea Surface Temperatures (SST) from fossil assemblages is a fundamental issue in palaeoclimatic and palaeoceanographic investigations. The Modern Analogue Technique, a widely adopted method based on direct comparison of fossil assemblages with modern coretop samples, was revised with the aim of conforming it to compositional data analysis. The new CODAMAT method was developed by adopting the Aitchison metric as distance measure. Modern coretop datasets are characterised by a large number of zeros. Zero replacement was carried out with a Bayesian approach based on a posterior estimation of the parameter of the multinomial distribution. The number of modern analogues from which to reconstruct the SST was determined by means of a multiple approach that considers the Proxies correlation matrix, the Standardized Residual Sum of Squares and the Mean Squared Distance. This new CODAMAT method was applied to the planktonic foraminiferal assemblages of a core recovered in the Tyrrhenian Sea. Key words: Modern analogues, Aitchison distance, Proxies correlation matrix, Standardized Residual Sum of Squares
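
The Aitchison distance named in this abstract can be illustrated with a minimal sketch. It assumes compositions that are already strictly positive (zeros replaced beforehand) and computes the distance as the Euclidean distance between centred log-ratio coordinates. The function names `clr`, `aitchison_distance` and `nearest_analogues`, and the sample values, are hypothetical illustrations and not the CODAMAT implementation.

```python
import numpy as np

def clr(x):
    """Centred log-ratio transform of a strictly positive composition."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def aitchison_distance(x, y):
    """Euclidean distance between the clr coordinates of x and y."""
    return float(np.linalg.norm(clr(x) - clr(y)))

def nearest_analogues(fossil, modern, k):
    """Indices of the k modern samples closest to the fossil sample."""
    d = np.array([aitchison_distance(fossil, m) for m in modern])
    return np.argsort(d)[:k]

# Made-up three-part assemblages, used only to exercise the functions.
modern = np.array([[0.20, 0.30, 0.50],
                   [0.60, 0.30, 0.10],
                   [0.21, 0.29, 0.50]])
fossil = np.array([0.20, 0.30, 0.50])
closest = nearest_analogues(fossil, modern, k=2)
```

Because the distance depends only on ratios between parts, rescaling a sample (for example from proportions to percentages) leaves it unchanged.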

Relevance: 100.00%

Abstract:

Our essay studies suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as “functional data” and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. To extend them to the analysis of compositional data, FCA techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First, we adapt the well-known spline smoothing techniques to cope with the smoothing of compositional data trajectories. An observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory; clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for differences in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithms. The quality of the obtained results is assessed by means of several indices.

Relevance: 100.00%

Abstract:

Many multivariate methods that are apparently distinct can be linked by introducing one or more parameters in their definition. Methods that can be linked in this way are correspondence analysis, unweighted or weighted logratio analysis (the latter also known as "spectral mapping"), nonsymmetric correspondence analysis, principal component analysis (with and without logarithmic transformation of the data) and multidimensional scaling. In this presentation I will show how several of these methods, which are frequently used in compositional data analysis, may be linked through parametrizations such as power transformations, linear transformations and convex linear combinations. Since the methods of interest here all lead to visual maps of data, a "movie" can be made where the linking parameter is allowed to vary in small steps: the results are recalculated "frame by frame" and one can see the smooth change from one method to another. Several of these "movies" will be shown, giving a deeper insight into the similarities and differences between these methods.

Relevance: 100.00%

Abstract:

In this paper we examine the problem of compositional data from a different starting point. Chemical compositional data, as used in provenance studies on archaeological materials, will be approached from the standpoint of measurement theory. The results will show, in a very intuitive way, that chemical data can only be treated by using the approach developed for compositional data. It will be shown that compositional data analysis is a particular case in projective geometry, when the projective coordinates are in the positive orthant, and they have the properties of logarithmic interval metrics. Moreover, it will be shown that this approach can be extended to a very large number of applications, including shape analysis. This will be exemplified with a case study on the architecture of Early Christian churches dating back to the 5th-7th centuries AD.

Relevance: 100.00%

Abstract:

Factor analysis, a frequently used technique for multivariate data inspection, is also widely used for compositional data analysis. The usual way is to use a centered logratio (clr) transformation to obtain the random vector $y$ of dimension $D$. The factor model is then $y = \Lambda f + e$ (1), with the factors $f$ of dimension $k < D$, the error term $e$, and the loadings matrix $\Lambda$. Using the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as $\mathrm{Cov}(y) = \Lambda\Lambda^{T} + \psi$ (2), where $\psi = \mathrm{Cov}(e)$ has a diagonal form. The diagonal elements of $\psi$ as well as the loadings matrix $\Lambda$ are estimated from an estimate of $\mathrm{Cov}(y)$. Given observed clr-transformed data $Y$ as realizations of the random vector $y$, outliers or deviations from the idealized model assumptions of factor analysis can severely affect the parameter estimation. As a way out, robust estimation of the covariance matrix of $Y$ will lead to robust estimates of $\Lambda$ and $\psi$ in (2), see Pison et al. (2003). Well-known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g., Maronna et al., 2006), rely on a full-rank data matrix $Y$, which is not the case for clr-transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix $Y$ is transformed to a matrix $Z$ by using an orthonormal basis of lower dimension. Using the ilr-transformed data, a robust covariance matrix $C(Z)$ can be estimated. The result can be back-transformed to the clr space by $C(Y) = V\,C(Z)\,V^{T}$, where the matrix $V$ with orthonormal columns comes from the relation between the clr and the ilr transformation. Now the parameters in the model (2) can be estimated (Basilevsky, 1994) and the results have a direct interpretation, since the links to the original variables are still preserved. The above procedure will be applied to data from geochemistry. Our special interest is in comparing the results with those of Reimann et al. (2002) for the Kola project data.
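
The clr-to-ilr step and the back-transformation $C(Y) = V\,C(Z)\,V^{T}$ can be sketched as follows. This is a minimal illustration under stated assumptions: the basis built by the hypothetical helper `ilr_basis` is one of many valid orthonormal bases, the data are simulated Dirichlet draws, and the classical sample covariance from `np.cov` stands in for the robust estimator (MCD or an S-estimator) that the abstract calls for.

```python
import numpy as np

def clr(X):
    """Row-wise centred log-ratio transform of strictly positive data."""
    logX = np.log(np.asarray(X, dtype=float))
    return logX - logX.mean(axis=1, keepdims=True)

def ilr_basis(D):
    """D x (D-1) matrix V with orthonormal columns that each sum to zero."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0 / j            # first j parts get equal weight
        V[j, j - 1] = -1.0                # contrasted against part j+1
        V[:, j - 1] *= np.sqrt(j / (j + 1.0))  # scale column to unit length
    return V

# Simulated 5-part compositions (illustrative data only).
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(5), size=200)

Y = clr(X)                       # singular: rows of Y sum to zero
V = ilr_basis(X.shape[1])
Z = Y @ V                        # ilr coordinates, full rank D-1

# Stand-in for a robust covariance estimator applied to Z.
C_Z = np.cov(Z, rowvar=False)
C_Y = V @ C_Z @ V.T              # back-transformed to clr space
```

With the classical covariance as the estimator, `C_Y` coincides with the sample covariance of the clr data, which is a convenient sanity check before substituting a robust estimator for `np.cov`.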

Relevance: 100.00%

Abstract:

With the increasing awareness of protein folding disorders, the explosion of genomic information, and the need for efficient ways to predict protein structure, protein folding and unfolding has become a central issue in molecular sciences research. Molecular dynamics computer simulations are increasingly employed to understand the folding and unfolding of proteins. Running protein unfolding simulations is computationally expensive and finding ways to enhance performance is a grid issue on its own. However, more and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. This paper describes efforts to provide a grid-enabled data warehouse for protein unfolding data. We outline the challenge and present first results in the design and implementation of the data warehouse.

Relevance: 100.00%

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance: 100.00%

Abstract:

This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to calibration of geochemical data and multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry, for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in ‘raw-format’, requiring data processing in order to remove instrument bias, as well as informed sequence interpretation. The applicability of conventional calibration equations to core-scanning XRF data is further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as matrix effects. Log-ratio based calibration schemes have been developed and applied to clastic sedimentary sequences, focusing mainly on energy dispersive-XRF (ED-XRF) core-scanning. This study has applied high resolution core-scanning XRF to Holocene sedimentary sequences from the tidal-dominated Indian Sundarbans (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a sub-set of core-scan and conventional ED-XRF data to quantify elemental composition. This provides a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.
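
Reduced major axis regression of log-ratio transformed data, as mentioned in this abstract, can be sketched in a few lines. The sketch assumes the common form of a log-ratio calibration in which the log-ratio of two element concentrations is modelled as a linear function of the log-ratio of the corresponding scanner intensities. The function name `rma_fit` and the numbers are hypothetical and synthetic; they do not come from the study.

```python
import numpy as np

def rma_fit(x, y):
    """Reduced major axis fit y = slope * x + intercept.

    The slope is sign(r) * sd(y) / sd(x), so errors in both
    variables are treated symmetrically.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

# Synthetic example: log-ratios of scanner intensities (x) and of
# reference concentrations (y) for one element pair.
log_intensity_ratio = np.array([0.1, 0.5, 0.9, 1.4])
log_conc_ratio = 2.0 * log_intensity_ratio + 1.0

alpha, beta = rma_fit(log_intensity_ratio, log_conc_ratio)

# Predict a concentration log-ratio for a new scanner measurement.
predicted = alpha * 0.7 + beta
```

Unlike ordinary least squares, the fitted line is the same whichever variable is taken as the predictor, which suits a calibration where both measurements carry error.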

Relevance: 100.00%

Abstract:

Isotopic data are becoming an important source of information regarding sources, evolution and mixing processes of water in hydrogeologic systems. However, it is not clear how to treat geochemical and isotopic data together statistically. We propose to introduce the isotopic information as new parts, and to apply compositional data analysis to the resulting enlarged composition. Results are equivalent to downscaling the classical isotopic delta variables, because they are already relative (as needed in the compositional framework) and isotopic variations are almost always very small. This methodology is illustrated and tested with the study of the Llobregat River Basin (Barcelona, NE Spain), where it is shown that, though very small, isotopic variations complement geochemical principal components and help to better identify pollution sources.

Relevance: 100.00%

Abstract:

Numerous studies have reported links between insulin-like growth factors (IGFs) and the extra-cellular matrix protein vitronectin (VN). We ourselves have reported that IGF-I binds to VN via IGF-binding proteins (IGFBPs) to stimulate HaCaT and MCF-7 cell migration. Here, we detail the functional evaluation of IGFBP-1, -2, -3, -4 and -6 in the presence and absence of IGF-I and VN. The data presented here, combined with our prior data on IGFBP-5, suggest that IGFBP-3, -4 and -5 are the most effective at stimulating cell migration in combination with IGF-I and VN. In addition, we demonstrate that different regions within IGFBP-3 and -4 are critical for complex formation. Furthermore, we examine whether multi-protein complexes of IGF-I and IGFBPs associated with fibronectin and collagen IV are also able to enhance functional biological responses.

Relevance: 100.00%

Abstract:

A number of reports have demonstrated the importance of the CUB domain-containing protein 1 (CDCP1) in facilitating cancer progression in animal models and the potential of this protein as a prognostic marker in several malignancies. CDCP1 facilitates metastasis formation in animal models by negatively regulating anoikis, a type of apoptosis triggered by the loss of attachment signalling from cell-cell contacts or cell-extracellular matrix (ECM) contacts. Due to the important role CDCP1 plays in cancer progression in model systems, it is considered a potential drug target to prevent the metastatic spread of cancers. CDCP1 is a highly glycosylated 836 amino acid cell surface protein. It has structural features potentially facilitating protein-protein interactions, including 14 N-glycosylation sites, three CUB-like domains, 20 cysteine residues likely to be involved in disulfide bond formation and five intracellular tyrosine residues. CDCP1 interacts with a variety of proteins including Src family kinases (SFKs) and protein kinase C δ (PKCδ). Efforts to understand the mechanisms regulating these interactions have largely focussed on three CDCP1 tyrosine residues: Y734, Y743 and Y762. CDCP1-Y734 is the site where SFKs phosphorylate and bind to CDCP1 and mediate subsequent phosphorylation of CDCP1-Y743 and -Y762, which leads to binding of PKCδ at CDCP1-Y762. The resulting trimeric protein complex of SFK•CDCP1•PKCδ has been proposed to mediate an anti-apoptotic cell phenotype in vitro, and to promote metastasis in vivo. The effect of mutation of the three tyrosines on interactions of CDCP1 with SFKs and PKCδ and the consequences on cell phenotype in vitro and in vivo have not been examined. CDCP1 has a predicted molecular weight of ~90 kDa but is usually detected as a protein which migrates at ~135 kDa by Western blot analysis due to its high degree of glycosylation.
A low molecular weight form of CDCP1 (LMW-CDCP1) of ~70 kDa has been found in a variety of cancer cell lines. The mechanisms leading to the generation of LMW-CDCP1 in vivo are not well understood, but an involvement of proteases in this process has been proposed. Serine proteases including plasmin and trypsin are able to proteolytically process CDCP1. In addition, the recombinant protease domain of the serine protease matriptase is also able to cleave the recombinant extracellular portion of CDCP1. Whether matriptase is able to proteolytically process CDCP1 on the cell surface has not been examined. Importantly, proteolytic processing of CDCP1 by trypsin leads to phosphorylation of its cell surface-retained portion, which suggests that this event leads to initiation of an intracellular signalling cascade. This project aimed to further examine the biology of CDCP1 with a main focus on exploring the roles played by CDCP1 tyrosine residues. To achieve this, HeLa cells stably expressing CDCP1 or the CDCP1 tyrosine mutants Y734F, Y743F and Y762F were generated. These cell lines were used to examine:
• The roles of the tyrosine residues Y734, Y743 and Y762 in mediating interactions of CDCP1 with binding proteins, and the effect of the stable expression on HeLa cell morphology.
• The ability of the serine protease matriptase to proteolytically process cell surface CDCP1, and the consequences of this event on HeLa cell phenotype and cell signalling in vitro.
• The importance of these residues in processes associated with cancer progression in vitro, including adhesion, proliferation and migration.
• The role of these residues in metastatic phenotype in vivo, and the ability of a function-blocking anti-CDCP1 antibody to inhibit metastasis in the chicken embryo chorioallantoic membrane (CAM) assay.
Interestingly, biochemical experiments carried out in this study revealed that mutation of certain CDCP1 tyrosine residues impacts on interactions of this protein with binding proteins. For example, binding of SFKs as well as PKCδ to CDCP1 was markedly decreased in HeLa-CDCP1-Y734F cells, and binding of PKCδ was also reduced in HeLa-CDCP1-Y762F cells. In contrast, HeLa-CDCP1-Y743F cells did not display altered interactions with CDCP1 binding proteins. Importantly, observed differences in interactions of CDCP1 with binding partners impacted on basal phosphorylation of CDCP1. It was found that HeLa-CDCP1, HeLa-CDCP1-Y743F and -Y762F displayed strong basal levels of CDCP1 phosphorylation. In contrast, HeLa-CDCP1-Y734F cells did not display CDCP1 phosphorylation but exhibited constitutive phosphorylation of focal adhesion kinase (FAK) at tyrosine 861. Significantly, subsequent investigations to examine this observation suggested that CDCP1-Y734 and FAK-Y861 are competitive substrates for SFK-mediated phosphorylation. It appeared that SFK-mediated phosphorylation of CDCP1-Y734 and FAK-Y861 is an equilibrium which shifts depending on the level of CDCP1 expression in HeLa cells. This suggests that the level of CDCP1 expression may act as a regulatory mechanism allowing cells to switch from a FAK-Y861 mediated pathway to a CDCP1-Y734 mediated pathway. This is the first time that a link between SFKs, CDCP1 and FAK has been demonstrated. One of the most interesting observations from this work was that CDCP1 altered HeLa cell morphology, causing an elongated and fibroblast-like appearance. Importantly, this morphological change depended on CDCP1-Y734. In addition, it was observed that this change in cell morphology was accompanied by increased phosphorylation of SFK-Y416. This suggests that interactions of SFKs with CDCP1-Y734 increase SFK activity, since SFK-Y416 is critical in regulating kinase activity of these proteins.
The essential role of SFKs in mediating CDCP1-induced HeLa cell morphological changes was demonstrated using the SFK-selective inhibitor SU6656. This inhibitor caused reversion of HeLa-CDCP1 cell morphology to an epithelial appearance characteristic of HeLa-vector cells. Significantly, in vitro studies revealed that certain CDCP1-mediated cell phenotypes are mediated by cellular pathways dependent on CDCP1 tyrosine residues whereas others are independent of these sites. For example, CDCP1 expression caused a marked increase in HeLa cell motility that was independent of CDCP1 tyrosine residues. In contrast, the CDCP1-induced decrease in HeLa cell proliferation was most prominent in HeLa-CDCP1-Y762F cells, potentially indicating a role for this site in regulating proliferation in HeLa cells. Another cellular event which was identified to require phosphorylation of a particular CDCP1 tyrosine residue is adhesion to fibronectin. It was observed that the CDCP1-mediated strong decrease in adhesion to fibronectin is mostly restored in HeLa-CDCP1-Y743F cells. This suggests a possible role for CDCP1-Y743 in causing a CDCP1-mediated decrease in adhesion. Data from in vivo experiments indicated that HeLa-CDCP1-Y734F cells are more metastatic than HeLa-CDCP1 cells in vivo. This indicates that interaction of CDCP1 with SFKs and PKCδ may not be required for CDCP1-mediated metastasis formation of HeLa cells in vivo. The metastatic phenotype of these cells may be caused by signalling involving FAK, since HeLa-CDCP1-Y734F cells are the only CDCP1 expressing cells displaying constitutive phosphorylation of FAK-Y861. HeLa-CDCP1-Y762F cells displayed a very low metastatic ability, which suggests that this CDCP1 tyrosine residue is important in mediating a pro-metastatic phenotype in HeLa cells.
More detailed exploration of cellular events occurring downstream of CDCP1-Y734 and -Y762 may provide important insights into the mechanisms altering the metastatic ability of CDCP1 expressing HeLa cells. Complementing the in vivo studies, anti-CDCP1 antibodies were employed to assess whether these antibodies are able to inhibit metastasis of HeLa cells expressing CDCP1 or CDCP1 tyrosine mutants. It was found that HeLa-CDCP1-Y734F cells were the only cell line which was markedly reduced in the ability to metastasise. In contrast, the ability of HeLa-CDCP1, HeLa-CDCP1-Y743F and -Y762F cells to metastasise in vivo was not inhibited. These data suggest a possible role of interactions of CDCP1 with SFKs, occurring at CDCP1-Y734, in preventing an anti-metastatic effect of anti-CDCP1 antibodies in vivo. The proposal that SFKs may play a role in regulating anti-metastatic effects of anti-CDCP1 antibodies was supported by another experiment in which differences between HeLa-CDCP1 cells and CDCP1 expressing HeLa cells (HeLa-CDCP1-S) from collaborators at the Scripps Research Institute were examined. It was found that HeLa-CDCP1-S cells express different SFKs than the CDCP1 expressing HeLa cells generated for this study. This is important since HeLa-CDCP1-S cells can be inhibited in their metastatic ability using anti-CDCP1 antibodies in vivo. Importantly, these data suggest that further examinations of the roles of SFKs in facilitating anti-metastatic effects of anti-CDCP1 antibodies may give insights into how CDCP1 can be blocked to prevent metastasis in vivo. This project also explored the ability of the serine protease matriptase to proteolytically process cell surface localised CDCP1, because it is unknown whether matriptase can cleave cell surface CDCP1 as has been reported for other proteases such as trypsin and plasmin.
Furthermore, the consequences of matriptase-mediated proteolysis on cell phenotype in vitro and cell signalling were examined, since recent reports suggested that proteolysis of CDCP1 leads to its phosphorylation and may initiate cell signalling and consequently alter cell phenotype. It was found that matriptase is able to proteolytically process cell surface CDCP1 at low nanomolar concentrations, which suggests that cleavage of CDCP1 by matriptase may facilitate the generation of LMW-CDCP1 in vivo. To examine whether matriptase-mediated proteolysis induced cell signalling, anti-phospho Erk 1/2 Western blot analysis was performed, as this pathway has previously been examined to study signalling in response to proteolytic processing of cell surface proteins. It was found that matriptase-mediated proteolysis in CDCP1 expressing HeLa cells initiated intracellular signalling via Erk 1/2. Interestingly, this increase in phosphorylation of Erk 1/2 was also observed in HeLa-vector cells. This suggested that initiation of cell signalling via Erk 1/2 phosphorylation as a result of matriptase-mediated proteolysis occurs by pathways independent of CDCP1. Subsequent investigations measuring the flux of free calcium ions and using a protease-activated receptor 2 (PAR2) agonist peptide confirmed this hypothesis. These data suggested that matriptase-mediated proteolysis results in cell signalling via a pathway induced by the activation of PAR2 rather than by CDCP1. This indicates that induction of cell signalling in HeLa cells as a consequence of matriptase-mediated proteolysis occurs via signalling pathways which do not involve phosphorylation of Erk 1/2. Consequently, it appears that future attempts should focus on the examination of cellular pathways other than Erk 1/2 to elucidate cell signalling initiated by matriptase-mediated proteolytic processing of CDCP1. The data presented in this thesis have explored in vitro and in vivo aspects of the biology of CDCP1.
The observations summarised above will permit the design of future studies to more precisely determine the role of CDCP1 and its binding partners in processes relevant to cancer progression. This may contribute to further defining CDCP1 as a target for cancer treatment.