905 results for Replicated Microarray Experiments
Abstract:
MOTIVATION: Microarray results accumulated in public repositories are widely reused in meta-analytical studies and secondary databases. The quality of the data obtained with this technology varies from experiment to experiment, and an efficient method for quality assessment is necessary to ensure their reliability. RESULTS: The lack of a good benchmark has hampered evaluation of existing methods for quality control. In this study, we propose a new independent quality metric that is based on evolutionary conservation of expression profiles. We show, using 11 large organ-specific datasets, that IQRray, a new quality metric developed by us, exhibits the highest correlation with this reference metric among the 14 metrics tested. IQRray outperforms other methods in identifying poor-quality arrays in datasets composed of arrays from many independent experiments. In contrast, the performance of methods designed for detecting outliers within a single experiment, such as Normalized Unscaled Standard Error and Relative Log Expression, was low, because these methods cannot detect datasets containing only low-quality arrays and because their scores cannot be directly compared between experiments. AVAILABILITY AND IMPLEMENTATION: The R implementation of IQRray is available at: ftp://lausanne.isb-sib.ch/pub/databases/Bgee/general/IQRray.R. CONTACT: Marta.Rosikiewicz@unil.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
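The abstract contrasts IQRray with per-experiment outlier diagnostics such as NUSE and RLE. Below is a minimal, hedged sketch of how those two diagnostics are commonly computed with the Bioconductor affyPLM package (an assumption on my part; the abstract does not name the software), while IQRray itself is distributed only as the R script at the FTP address above.

```r
# Minimal sketch (assumptions: Affymetrix CEL files in the working directory,
# Bioconductor packages 'affy' and 'affyPLM' installed). NUSE and RLE are the
# per-experiment outlier diagnostics the abstract compares IQRray against.
library(affy)
library(affyPLM)

raw  <- ReadAffy()    # read all *.CEL files in the working directory into an AffyBatch
pset <- fitPLM(raw)   # fit probe-level models per array

# Boxplots of the two quality scores: arrays with NUSE medians well above 1
# or RLE boxes far from 0 are flagged as potential outliers *within* this
# experiment -- the scores are not directly comparable across experiments.
par(mfrow = c(2, 1))
NUSE(pset, main = "Normalized Unscaled Standard Error")
RLE(pset,  main = "Relative Log Expression")
```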
Abstract:
Gene set enrichment (GSE) analysis is a popular framework for condensing information from gene expression profiles into a pathway or signature summary. The strengths of this approach over single-gene analysis include noise and dimension reduction, as well as greater biological interpretability. As molecular profiling experiments move beyond simple case-control studies, robust and flexible GSE methodologies are needed that can model pathway activity within highly heterogeneous data sets. To address this challenge, we introduce Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner. We demonstrate the robustness of GSVA in a comparison with current state-of-the-art sample-wise enrichment methods. Further, we provide examples of its utility in differential pathway activity and survival analysis. Lastly, we show how GSVA works analogously with data from both microarray and RNA-seq experiments. GSVA provides increased power to detect subtle pathway activity changes over a sample population in comparison to corresponding methods. While GSE methods are generally regarded as end points of a bioinformatic analysis, GSVA constitutes a starting point for building pathway-centric models of biology. Moreover, GSVA addresses the current need for GSE methods applicable to RNA-seq data. GSVA is an open-source software package for R which forms part of the Bioconductor project and can be downloaded at http://www.bioconductor.org.
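As a usage illustration, here is a minimal sketch of the gsva() call from the Bioconductor GSVA package. The expression matrix and gene sets are simulated placeholders, not data from the paper; depending on the installed GSVA version, the arguments may instead need to be wrapped in a gsvaParam object.

```r
# Minimal sketch of a GSVA run (assumptions: a numeric gene-by-sample
# expression matrix with gene identifiers as rownames, and a named list of
# gene sets; both are simulated placeholders).
library(GSVA)

expr <- matrix(rnorm(1000 * 20), nrow = 1000,
               dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:20)))
gene_sets <- list(setA = paste0("gene", 1:50),
                  setB = paste0("gene", 51:120))

# Classic interface: returns a gene-set-by-sample matrix of enrichment scores
# that can feed downstream differential-pathway or survival analyses.
es <- gsva(expr, gene_sets, method = "gsva", kcdf = "Gaussian")
dim(es)   # one row per gene set, one column per sample
```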
Abstract:
Currently, numerous high-throughput technologies are available for the study of human carcinomas, and many variations of these techniques have been described in the literature. The common denominator of these methodologies is the large amount of data obtained in a single experiment, in a short time period, and at fairly low cost. However, several problems and limitations of these methods have also been described. The purpose of this study was to test the applicability of two selected high-throughput methods, cDNA and tissue microarrays (TMA), in cancer research. Two common human malignancies, breast and colorectal cancer, were used as examples. This thesis aims to present some practical considerations that need to be addressed when applying these techniques. cDNA microarrays were applied to screen for aberrant gene expression in breast and colon cancers. Immunohistochemistry was used to validate the results and to evaluate the association of selected novel tumour markers with patient outcome. The type of histological material used in immunohistochemistry was evaluated, with particular regard to the applicability of whole tissue sections and different types of TMAs. Special attention was paid to the methodological details of the cDNA microarray and TMA experiments. In conclusion, many potential tumour markers were identified in the cDNA microarray analyses. Immunohistochemistry could be applied to validate the observed gene expression changes of selected markers and to associate their expression changes with patient outcome. In the current experiments, both TMAs and whole tissue sections could be used for this purpose. This study showed for the first time that securin and p120 catenin protein expression predict breast cancer outcome and that carbonic anhydrase IX immunopositivity is associated with the outcome of rectal cancer. The predictive value of these proteins was also statistically evident in multivariate analyses, with up to a 13.1-fold risk of cancer-specific death in a specific subgroup of patients.
Abstract:
High-throughput screening of the cellular effects of RNA interference (RNAi) libraries is now being increasingly applied to explore the role of genes in specific cell biological processes and disease states. However, the technology is still limited to specialty laboratories, owing to the requirements for robotic infrastructure, access to expensive reagent libraries, and expertise in high-throughput screening assay development, standardization, data analysis and applications. In the future, alternative screening platforms will be required to expand functional large-scale experiments to include more RNAi constructs and to allow combinatorial loss-of-function analyses (e.g. gene-gene or gene-drug interactions), gain-of-function screens, multi-parametric phenotypic readouts, and comparative analysis of many different cell types. Such comprehensive perturbation of gene networks in cells will require a major increase in the flexibility and throughput of screening platforms, together with a reduction of costs. As an alternative to conventional multi-well-based high-throughput screening platforms, the development of a novel cell spot microarray method for the production of high-density siRNA reverse transfection arrays is described here. The cell spot microarray platform is distinguished from the majority of other transfection cell microarray techniques by its spatially confined array layout, which allows highly parallel screening of large-scale RNAi reagent libraries with assays otherwise difficult or not applicable to high-throughput screening. This study describes the development of the cell spot microarray method along with biological application examples of high-content immunofluorescence and phenotype-based cancer cell biological analyses focusing on the regulation of prostate cancer cell growth, the maintenance of genomic integrity in breast cancer cells, and functional analysis of integrin protein-protein interactions in situ.
Abstract:
Presented herein is an experimental design that allows the effects of several radiative forcing factors on climate to be estimated as precisely as possible from a limited suite of atmosphere-only general circulation model (GCM) integrations. The forcings include the combined effect of observed changes in sea surface temperatures, sea ice extent, stratospheric (volcanic) aerosols, and solar output, plus the individual effects of several anthropogenic forcings. A single linear statistical model is used to estimate the forcing effects, each of which is represented by its global mean radiative forcing. The strong collinearity in time between the various anthropogenic forcings poses a technical problem that is overcome through the design of the experiment. This design uses every combination of anthropogenic forcings rather than a few highly replicated ensembles, the approach more commonly used in climate studies. Not only is this design highly efficient for a given number of integrations, but it also allows the estimation of (nonadditive) interactions between pairs of anthropogenic forcings. The simulated land surface air temperature changes since 1871 have been analyzed. The changes in natural and oceanic forcing, the latter of which itself contains some forcing from anthropogenic and natural influences, have the greatest effect. For the global mean, increasing greenhouse gases and the indirect aerosol effect had the largest anthropogenic effects. It was also found that an interaction between these two anthropogenic effects exists in the atmosphere-only GCM. This interaction is similar in magnitude to the individual effects of changing tropospheric and stratospheric ozone concentrations or to the direct (sulfate) aerosol effect. Various diagnostics are used to evaluate the fit of the statistical model. For the global mean, these show that the land temperature response is proportional to the global mean radiative forcing, reinforcing the use of radiative forcing as a measure of climate change. The diagnostic tests also show that the linear model is suitable for analyses of land surface air temperature at each GCM grid point. Therefore, the linear model provides precise estimates of the space-time signals for all forcing factors under consideration. For simulated 50-hPa temperatures, the results show that tropospheric ozone increases have contributed to stratospheric cooling over the twentieth century almost as much as changes in well-mixed greenhouse gases.
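To make the factorial design and interaction estimation concrete, the following is a hedged R sketch of fitting a linear model with main effects and one pairwise interaction to a full-factorial set of simulated integrations. All variable names (ghg, aerosol, ozone, temp_anomaly) and coefficients are hypothetical placeholders and are not taken from the study.

```r
# Hedged sketch of the kind of linear statistical model described above:
# estimating main effects and a pairwise interaction of anthropogenic forcings
# from a factorial set of (simulated) GCM integrations.
set.seed(1)
design <- expand.grid(ghg = c(0, 1), aerosol = c(0, 1), ozone = c(0, 1))
design$temp_anomaly <- with(design,
    0.8 * ghg - 0.4 * aerosol + 0.1 * ozone - 0.2 * ghg * aerosol +
    rnorm(nrow(design), sd = 0.05))

# Main effects plus the ghg:aerosol interaction, analogous to the nonadditive
# interaction between greenhouse gases and the indirect aerosol effect
# reported in the abstract.
fit <- lm(temp_anomaly ~ ghg * aerosol + ozone, data = design)
summary(fit)
```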
Abstract:
Background: Microarray-based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems, including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although some programmes are available for the interpretation of bacterial transcriptomics data and of CGH microarray data for examining genetic stability in oncogenes, there are none designed specifically for understanding the mosaic nature of bacterial genomes. Consequently, a bottleneck still persists in the accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process, which may be automated in the future, for understanding bacterial genomic diversity. Results: The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test strains against three reference strains simultaneously. Each stage of the process is described, and we have compared a number of methods available for characterising bacterial genomic diversity and for calculating the cut-off between gene presence and absence or divergence, showing that a simple dynamic approach using a kernel density estimator performed better than both established methods and a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion: After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.
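The kernel-density cut-off step can be illustrated with a short, hedged R sketch: estimate the density of the hybridisation log-ratios and place the cut-off at the valley between the "absent/divergent" and "present" modes. The simulated data and the valley-search window are illustrative assumptions, not the authors' exact procedure.

```r
# Hedged sketch of a kernel-density-based cut-off between 'present' and
# 'absent/divergent' genes, in the spirit of the dynamic approach described above.
set.seed(42)
log_ratio <- c(rnorm(1800, mean = 0,  sd = 0.3),   # conserved/present genes
               rnorm(200,  mean = -2, sd = 0.5))   # absent or divergent genes

d <- density(log_ratio)                 # kernel density estimate of the log-ratios
# Take the cut-off as the density minimum in the valley separating the two
# modes (window chosen here to lie between the simulated peaks).
valley <- d$x > -1.8 & d$x < -0.2
cutoff <- d$x[valley][which.min(d$y[valley])]

called_present <- log_ratio > cutoff
table(called_present)                   # genes called present vs absent/divergent
```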
Abstract:
Adhesion, immune evasion and invasion are key determinants of bacterial pathogenesis. Pathogenic bacteria possess a wide variety of surface-exposed and secreted proteins which allow them to adhere to tissues, escape the immune system and spread throughout the human body. Therefore, extensive contacts between the human and the bacterial extracellular proteomes take place at the host-pathogen interface at the protein level. Recent research has emphasized the importance of a global and deeper understanding of the molecular mechanisms that underlie bacterial immune evasion and pathogenesis. Through the use of a large-scale, unbiased, protein microarray-based approach and of large libraries of purified human and bacterial proteins, novel host-pathogen interactions were identified. This approach was first applied to Staphylococcus aureus, the cause of a wide variety of diseases ranging from skin infections to endocarditis and sepsis. The screening led to the identification of several novel interactions between the human and the S. aureus extracellular proteomes. The interaction between the S. aureus immune evasion protein FLIPr (formyl-peptide receptor like-1 inhibitory protein) and the human complement component C1q, key players in the offense-defense battle between host and pathogen, was characterized using label-free techniques and functional assays. The same approach was also applied to Neisseria meningitidis, a major cause of bacterial meningitis and fulminant sepsis worldwide. The screening led to the identification of several potential human receptors for the neisserial adhesin A (NadA), an important adhesion protein and key determinant of meningococcal interactions with the human host at various stages. The interaction between NadA and human LOX-1 (low-density oxidized lipoprotein receptor) was confirmed using label-free technologies and cell-binding experiments in vitro. Taken together, these two examples provide concrete insights into S. aureus and N. meningitidis pathogenesis, and establish protein microarrays, coupled with appropriate validation methodologies, as a powerful large-scale tool for host-pathogen interaction studies.
Abstract:
Nitrogen and water are essential for plant growth and development. In this study, we designed experiments to produce gene expression data from poplar roots under nitrogen starvation and water deprivation conditions. We found that low nitrogen concentration led first to increased root elongation, followed by lateral root proliferation and eventually increased root biomass. To identify genes regulating root growth and development under nitrogen starvation and water deprivation, we designed a series of data analysis procedures through which we successfully identified biologically important genes. Differentially Expressed Gene (DEG) analysis identified the genes that are differentially expressed under nitrogen starvation or drought. Protein domain enrichment analysis identified enriched themes (shared protein domains) that are highly interactive during the treatment. Gene Ontology (GO) enrichment analysis allowed us to identify biological processes changed during nitrogen starvation. Based on these analyses, we examined the local Gene Regulatory Network (GRN) and identified a number of transcription factors; after testing, one of them proved to be a highly ranked transcription factor in the hierarchy that affects root growth under nitrogen starvation. Because analyzing gene expression data manually is tedious and time-consuming, we also automated a computational pipeline that can now be used for DEG identification and protein domain analysis in a single run; it is implemented in Perl and R scripts.
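As an illustration of the DEG step of such a pipeline, here is a hedged R sketch using the Bioconductor limma package (an assumed choice; the thesis does not specify which R packages its pipeline uses). The expression matrix and sample labels are simulated placeholders.

```r
# Hedged sketch of the differentially-expressed-gene (DEG) step of a pipeline
# like the one described above, using the Bioconductor 'limma' package.
library(limma)

set.seed(7)
expr  <- matrix(rnorm(5000 * 6), nrow = 5000,
                dimnames = list(paste0("gene", 1:5000),
                                c("ctrl1", "ctrl2", "ctrl3",
                                  "starved1", "starved2", "starved3")))
group <- factor(c("control", "control", "control",
                  "nitrogen_starved", "nitrogen_starved", "nitrogen_starved"))

design <- model.matrix(~ group)          # intercept + starvation effect
fit    <- eBayes(lmFit(expr, design))    # linear model + empirical Bayes moderation
degs   <- topTable(fit, coef = 2, adjust.method = "BH", number = 20)
head(degs)                               # top candidate DEGs by moderated t-test
```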
Abstract:
Biomarker research relies on tissue microarrays (TMA). TMAs are produced by repeated transfer of small tissue cores from a 'donor' block into a 'recipient' block and then used for a variety of biomarker applications. The construction of conventional TMAs is labor intensive, imprecise, and time-consuming. Here, a protocol using next-generation Tissue Microarrays (ngTMA) is outlined. ngTMA is based on TMA planning and design, digital pathology, and automated tissue microarraying. The protocol is illustrated using an example of 134 metastatic colorectal cancer patients. Histological, statistical and logistical aspects are considered, such as the tissue type, specific histological regions, and cell types for inclusion in the TMA, the number of tissue spots, sample size, statistical analysis, and number of TMA copies. Histological slides for each patient are scanned and uploaded onto a web-based digital platform. There, they are viewed and annotated (marked) using a 0.6-2.0 mm diameter tool, multiple times using various colors to distinguish tissue areas. Donor blocks and 12 'recipient' blocks are loaded into the instrument. Digital slides are retrieved and matched to donor block images. Repeated arraying of annotated regions is automatically performed resulting in an ngTMA. In this example, six ngTMAs are planned containing six different tissue types/histological zones. Two copies of the ngTMAs are desired. Three to four slides for each patient are scanned; 3 scan runs are necessary and performed overnight. All slides are annotated; different colors are used to represent the different tissues/zones, namely tumor center, invasion front, tumor/stroma, lymph node metastases, liver metastases, and normal tissue. 17 annotations/case are made; time for annotation is 2-3 min/case. 12 ngTMAs are produced containing 4,556 spots. Arraying time is 15-20 hr. Due to its precision, flexibility and speed, ngTMA is a powerful tool to further improve the quality of TMAs used in clinical and translational research.
Abstract:
The ongoing oceanic uptake of anthropogenic carbon dioxide (CO2) is significantly altering the carbonate chemistry of seawater, a phenomenon referred to as ocean acidification. Experimental manipulations have been increasingly used to gauge how continued ocean acidification will potentially impact marine ecosystems and their associated biogeochemical cycles in the future; however, results amongst studies, particularly those performed on natural communities, are highly variable, which may reflect community- or environment-specific responses or inconsistencies in experimental approach. To investigate the potential for identifying more generic responses and achieving greater experimental reproducibility, we devised and implemented a series (n = 8) of short-term (2-4 days), multi-level (>=4 conditions) carbonate chemistry/nutrient manipulation experiments on a range of natural microbial communities sampled in Northwest European shelf seas. Carbonate chemistry manipulations and the resulting biological responses were found to be highly reproducible within individual experiments and, to a lesser extent, between geographically separated experiments. Statistically robust, reproducible physiological responses of phytoplankton to increasing pCO2, characterised by a suppression of net growth for small-sized cells (<10 µm), were observed in the majority of the experiments, irrespective of natural or manipulated nutrient status. The remaining between-experiment variability was potentially linked to initial community structure and/or other site-specific environmental factors. Analysis of carbon cycling within the experiments revealed the expected increased sensitivity of carbonate chemistry to biological processes at higher pCO2 and hence lower buffer capacity. The results thus emphasise how biogeochemical feedbacks may be altered in the future ocean.
Abstract:
Nitrous oxide emissions from a network of agricultural experiments in Europe were used to explore the relative importance of site and management controls of emissions. At each site, a selection of management interventions was compared within replicated experimental designs in plot-based experiments. Arable experiments were conducted at Beano in Italy, El Encin in Spain, Foulum in Denmark, Logarden in Sweden, Maulde in Belgium, Paulinenaue in Germany, and Tulloch in the UK. Grassland experiments were conducted at Crichton, Nafferton and Peaknaze in the UK, Godollo in Hungary, Rzecin in Poland, Zarnekow in Germany, and Theix in France. Nitrous oxide emissions were measured at each site over a period of at least two years using static chambers. Emissions varied widely between sites and as a result of the manipulation treatments. Average site emissions (over the study period) varied between 0.04 and 21.21 kg N2O-N ha−1 yr−1, with the largest fluxes and variability associated with the grassland sites. Total nitrogen addition was found to be the single most important determinant of emissions, accounting for 15% of the variance (using linear regression) in the data from the arable sites (p<0.0001) and 77% in the grassland sites. The annual emissions from arable sites were significantly greater than those that would be predicted by IPCC default emission factors. Variability of N2O emissions within sites arising from the manipulation treatments was greater than that resulting from site-to-site and year-to-year variation, highlighting the importance of management interventions in contributing to greenhouse gas mitigation.
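The variance-explained figures quoted above come from simple linear regression. The following hedged R sketch shows how such an R² is obtained, using simulated placeholder data rather than the network's measurements.

```r
# Hedged sketch of the variance-explained calculation described above:
# regress annual N2O emissions on total nitrogen addition and read off R-squared.
set.seed(3)
n_addition <- runif(60, min = 0, max = 300)             # simulated kg N ha-1 yr-1
emission   <- 0.01 * n_addition + rnorm(60, sd = 1.5)   # simulated kg N2O-N ha-1 yr-1

fit <- lm(emission ~ n_addition)
summary(fit)$r.squared    # proportion of emission variance explained by N addition
```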
Abstract:
We sought to create a comprehensive catalog of yeast genes whose transcript levels vary periodically within the cell cycle. To this end, we used DNA microarrays and samples from yeast cultures synchronized by three independent methods: α factor arrest, elutriation, and arrest of a cdc15 temperature-sensitive mutant. Using periodicity and correlation algorithms, we identified 800 genes that meet an objective minimum criterion for cell cycle regulation. In separate experiments, designed to examine the effects of inducing either the G1 cyclin Cln3p or the B-type cyclin Clb2p, we found that the mRNA levels of more than half of these 800 genes respond to one or both of these cyclins. Furthermore, we analyzed our set of cell cycle–regulated genes for known and new promoter elements and show that several known elements (or variations thereof) contain information predictive of cell cycle regulation. A full description and complete data sets are available at http://cellcycle-www.stanford.edu
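One simple way to score periodicity of the kind described is to correlate each gene's time course with sine and cosine reference curves at the cell-cycle period. The hedged R sketch below uses simulated data, an assumed period and illustrative time points; it is not the authors' exact algorithm.

```r
# Hedged sketch of a periodicity score: project each gene's time course onto
# sine/cosine components at an assumed cell-cycle period.
set.seed(5)
time_pts <- seq(0, 119, by = 7)                    # minutes after release (illustrative)
period   <- 66                                     # assumed cycle length in minutes
expr     <- matrix(rnorm(500 * length(time_pts)), nrow = 500,
                   dimnames = list(paste0("gene", 1:500), NULL))
# spike in one clearly periodic gene so the score has something to find
expr[1, ] <- sin(2 * pi * time_pts / period) + rnorm(length(time_pts), sd = 0.2)

sin_ref <- sin(2 * pi * time_pts / period)
cos_ref <- cos(2 * pi * time_pts / period)
# periodicity score: magnitude of correlation with the sine/cosine pair
score <- sqrt(cor(t(expr), sin_ref)^2 + cor(t(expr), cos_ref)^2)
head(sort(score[, 1], decreasing = TRUE))          # gene1 should rank near the top
```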
Abstract:
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
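To illustrate the supervised set-up, here is a hedged R sketch of training an SVM on expression profiles with the e1071 package (an assumed substitute for the SVM software used in the paper). The profiles and class labels are simulated placeholders.

```r
# Hedged sketch of supervised gene classification from expression profiles with
# a support vector machine, via the 'e1071' R package (an assumed tool choice).
library(e1071)

set.seed(11)
n_genes  <- 200
profiles <- matrix(rnorm(n_genes * 30), nrow = n_genes)   # 30 expression conditions per gene
in_class <- factor(rep(c("member", "non_member"), each = n_genes / 2))
# give class members a shifted profile so the classifier has signal to learn
profiles[in_class == "member", ] <- profiles[in_class == "member", ] + 1

fit  <- svm(x = profiles, y = in_class, kernel = "radial", cross = 5)
pred <- predict(fit, profiles)
table(predicted = pred, truth = in_class)   # training-set confusion matrix
```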
Abstract:
Bacterial pathogens manipulate host cells to promote pathogen survival and dissemination. We used a 22,571-element human cDNA microarray to identify host pathways that are affected by the Salmonella enterica subspecies typhimurium phoP gene, which encodes a transcription factor required for virulence, by comparing the expression profiles of human monocytic tissue culture cells infected with either wild-type bacteria or a phoP::Tn10 mutant strain. Both wild-type and phoP::Tn10 bacteria induced a common set of genes, many of which are proinflammatory. Differentially expressed genes included those that affect host cell death, suggesting that the phoP regulatory system controls bacterial genes that alter macrophage survival. Subsequent experiments showed that the phoP::Tn10 mutant strain is defective for killing both cultured and primary human macrophages but is able to replicate intracellularly. These experiments indicate that phoP plays a role in Salmonella-induced human macrophage cell death.