91 results for GENE SET ANALYSIS


Relevance:

100.00%

Abstract:

In this article, we focus on the analysis of competitive gene set methods for detecting the statistical significance of pathways from gene expression data. Our main result is to demonstrate that some of the most frequently used gene set methods, GSEA, GSEArot and GAGE, are severely influenced by the filtering of the data, to the point that such an analysis is no longer reconcilable with the principles of statistical inference, rendering the obtained results, in the worst case, uninformative. A possible consequence of this is that these methods can increase their power by the addition of unrelated data and noise. Our results are obtained within a bootstrapping framework that allows a rigorous assessment of the robustness of results and enables power estimates. Our results indicate that when using competitive gene set methods, it is imperative to apply a stringent gene filtering criterion. However, even when genes are filtered appropriately, this is not enough to ensure the statistical soundness of GSEA, GSEArot and GAGE for gene expression data from chips that do not provide genome-scale coverage of the expression values of all mRNAs. For this reason, we strongly advise against using GSEA, GSEArot and GAGE on such data sets in biomedical and clinical studies.
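
The dependence on filtering can be illustrated with a toy competitive test, in which a gene set is scored against random gene sets drawn from all retained genes. This is a minimal sketch and not GSEA, GSEArot or GAGE: the function name, the use of a mean per-gene score, and the gene-sampling null are all assumptions made for illustration.

```python
import random

def competitive_pvalue(scores, gene_set, n_draws=1000, seed=0):
    """Gene-sampling p-value for a gene set (hypothetical helper).

    scores: dict mapping gene -> per-gene statistic (for example |t|).
    The background is every gene present in `scores`, so filtering the
    input changes the reference distribution.
    """
    rng = random.Random(seed)
    members = [g for g in gene_set if g in scores]
    background = list(scores)
    observed = sum(scores[g] for g in members) / len(members)
    hits = 0
    for _ in range(n_draws):
        # Draw a random gene set of the same size from the background.
        draw = rng.sample(background, len(members))
        if sum(scores[g] for g in draw) / len(members) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_draws + 1)

# Toy data: 5 strongly changed genes among 100 measured genes.
scores = {f"g{i}": 0.01 * i for i in range(95)}
scores.update({f"s{i}": 5.0 for i in range(5)})
pathway = [f"s{i}" for i in range(5)]

p_all = competitive_pvalue(scores, pathway)
# Keeping only genes with a score of at least 4 leaves the pathway as
# its own background, so the same set is no longer distinguishable.
filtered = {g: s for g, s in scores.items() if s >= 4.0}
p_filtered = competitive_pvalue(filtered, pathway)
```

In this toy case the same gene set moves from a small p-value to a p-value of 1 purely because the background changed, which is the kind of sensitivity the abstract warns about.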

Relevance:

100.00%

Abstract:

Motivation: To date, Gene Set Analysis (GSA) approaches primarily focus on identifying differentially expressed gene sets (pathways). Methods for identifying differentially coexpressed pathways also exist but are mostly based on aggregated pairwise correlations, or other pairwise measures of coexpression. Instead, we propose Gene Sets Net Correlations Analysis (GSNCA), a multivariate differential coexpression test that accounts for the complete correlation structure between genes.

Results: In GSNCA, weight factors are assigned to genes in proportion to the genes' cross-correlations (intergene correlations). The problem of finding the weight vectors is formulated as an eigenvector problem with a unique solution. GSNCA tests the null hypothesis that for a gene set there is no difference in the weight vectors of the genes between two conditions. In simulation studies and the analyses of experimental data, we demonstrate that GSNCA, indeed, captures changes in the structure of genes' cross-correlations rather than differences in the averaged pairwise correlations. Thus, GSNCA infers differences in coexpression networks while bypassing the method-dependent steps of network inference. As an additional result from GSNCA, we define hub genes as genes with the largest weights and show that these genes correspond frequently to major and specific pathway regulators, as well as to genes that are most affected by the biological difference between two conditions. In summary, GSNCA is a new approach for the analysis of differentially coexpressed pathways that also evaluates the importance of the genes in the pathways, thus providing unique information that may result in the generation of novel biological hypotheses.
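
The weight-vector construction described above can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: it assumes the weights are the principal eigenvector of the matrix of absolute Pearson correlations with a zeroed diagonal, normalized to sum to 1, that the test statistic is the L1 distance between the two weight vectors, and that significance is assessed by permuting sample labels. All function names are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def gene_weights(samples, iters=500):
    """Weights w with w_i proportional to sum over j != i of |r_ij| * w_j.

    samples: list of rows, one per sample, each a list of gene values.
    """
    genes = list(zip(*samples))
    p = len(genes)
    a = [[0.0 if i == j else abs(pearson(genes[i], genes[j]))
          for j in range(p)] for i in range(p)]
    w = [1.0 / p] * p
    for _ in range(iters):
        # Power iteration on (A + I): the shift leaves the eigenvectors
        # of A unchanged and avoids oscillation.
        w = [w[i] + sum(a[i][j] * w[j] for j in range(p)) for i in range(p)]
        total = sum(w)
        w = [x / total for x in w]
    return w

def weight_distance(group1, group2):
    """L1 distance between the weight vectors of two conditions."""
    w1, w2 = gene_weights(group1), gene_weights(group2)
    return sum(abs(u - v) for u, v in zip(w1, w2))

def permutation_pvalue(group1, group2, n_perm=200, seed=0):
    """Permute sample labels to build the null distribution."""
    rng = random.Random(seed)
    observed = weight_distance(group1, group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if weight_distance(pooled[:n1], pooled[n1:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Under these assumptions, the genes with the largest entries in the output of `gene_weights` would play the role of the hub genes mentioned in the abstract.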

Relevance:

100.00%

Abstract:

Empirically derived phenotypic measurements have the potential to enhance gene-finding efforts in schizophrenia. Previous research based on factor analyses of symptoms has typically included schizoaffective cases. Deriving factor loadings from analysis of only narrowly defined schizophrenia cases could yield more sensitive factor scores for gene pathway and gene ontology analyses. Using an Irish family sample, this study 1) factor analyzed clinician-rated Operational Criteria Checklist items in cases with schizophrenia only, 2) scored the full sample based on these factor loadings, and 3) implemented genome-wide association, gene-based, and gene-pathway analysis of these SCZ-based symptom factors (final N = 507). Three factors emerged from the analysis of the schizophrenia cases: a manic, a depressive, and a positive symptom factor. In gene-based analyses of these factors, multiple genes had q < 0.01. Of particular interest are findings for PTPRG and WBP1L, both of which were previously implicated by the Psychiatric Genomics Consortium study of SCZ; results from this study suggest that variants in these genes might also act as modifiers of SCZ symptoms. Gene pathway analyses of the first factor indicated over-representation of glutamatergic transmission, GABA-A receptor, and cyclic GMP pathways. Results suggest that these pathways may have differential influence on affective symptom presentation in schizophrenia.

Relevance:

100.00%

Abstract:

PURPOSE: IGFBP7 belongs to a family of insulin-like growth factor-1 regulatory binding proteins. IGFBP7 hypermethylation is associated with its down-regulation in various carcinomas. In prostate cancer IGFBP7 down-regulation has been widely reported but to our knowledge the mechanisms behind this event are unknown. We performed a denaturing high performance liquid chromatography screening and validation strategy to profile the methylation status of IGFBP7 in prostate cancer.

MATERIALS AND METHODS: We combined denaturing high performance liquid chromatography and bisulfite sequencing to examine IGFBP7 methylation in a panel of prostate cancer cell lines. Quantitative methylation specific polymerase chain reaction was used to determine methylation levels in prostate tissue specimens of primary prostate cancer, histologically benign prostate adjacent to tumor, high grade prostatic intraepithelial neoplasia and benign prostatic hyperplasia. IGFBP7 gene expression was measured by quantitative reverse transcription polymerase chain reaction in cell lines and tissue specimens.

RESULTS: IGFBP7 was methylated in the 4 prostate cancer cell lines DU145, LNCaP, PC-3 and 22RV1. Quantitative methylation specific polymerase chain reaction analysis revealed that promoter methylation was associated with decreased IGFBP7 expression. Quantitative methylation specific polymerase chain reaction showed that IGFBP7 methylation was more frequently detected in prostate cancer (60% (31/52)) and high grade prostatic intraepithelial neoplasia (40% (6/15)) samples compared to histologically benign prostate adjacent to tumor (10%) and benign prostatic hyperplasia (0%) samples.

CONCLUSIONS: To our knowledge this is the first report of aberrant IGFBP7 promoter hypermethylation and concurrent IGFBP7 gene silencing in prostate cancer cell lines. Results demonstrate that CpG methylation of IGFBP7 may represent a novel biomarker of prostate cancer and pre-invasive neoplasms. Thus, future examination of IGFBP7 methylation and expression in a larger patient cohort, including bodily fluids, is justified to further evaluate its role in a diagnostic and prognostic setting.

Relevance:

100.00%

Abstract:

One of the major challenges in systems biology is to understand the complex responses of a biological system to external perturbations or internal signalling depending on its biological conditions. Genome-wide transcriptomic profiling of cellular systems under various chemical perturbations allows the manifestation of certain features of the chemicals through their transcriptomic expression profiles. The insights obtained may help to establish the connections between human diseases, associated genes and therapeutic drugs. The main objective of this study was to systematically analyse cellular gene expression data under various drug treatments to elucidate drug-feature-specific transcriptomic signatures. We first extracted drug-related information (drug features) from the collected textual description of DrugBank entries using text-mining techniques. A novel statistical method employing orthogonal least squares learning was proposed to obtain drug-feature-specific signatures by integrating gene expression with DrugBank data. To obtain robust signatures from noisy input datasets, a stringent ensemble approach was applied with the combination of three techniques: resampling, leave-one-out cross validation, and aggregation. The validation experiments showed that the proposed method has the capacity of extracting biologically meaningful drug-feature-specific gene expression signatures. It was also shown by regulatory network analysis that most of the signature genes are connected with common hub genes. The common hub genes were further shown to be related to general drug metabolism by Gene Ontology analysis. Each set of genes has relatively few interactions with other sets, indicating the modular nature of each signature and its drug-feature-specificity. Based on Gene Ontology analysis, we also found that each set of drug feature (DF)-specific genes was indeed enriched in biological processes related to the drug feature. The results of these experiments demonstrated the potential of the method for predicting certain features of new drugs using their transcriptomic profiles, providing a useful methodological framework and a valuable resource for drug development and characterization.

Relevance:

100.00%

Abstract:

Motivation: Recently, many univariate and several multivariate approaches have been suggested for testing differential expression of gene sets between different phenotypes. However, despite a wealth of literature studying their performance on simulated and real biological data, there is still a need to quantify their relative performance when they are testing different null hypotheses.

Results: In this article, we compare the performance of univariate and multivariate tests on both simulated and biological data. In the simulation study we demonstrate that high correlations equally affect the power of both univariate and multivariate tests. In addition, for most of them the power is similarly affected by the dimensionality of the gene set and by the percentage of genes in the set for which expression changes between two phenotypes. The application of different test statistics to biological data reveals that three statistics (sum of squared t-tests, Hotelling's T², N-statistic), testing different null hypotheses, find some common but also some complementary differentially expressed gene sets under specific settings. This demonstrates that, because their null hypotheses are complementary, each test captures different aspects of the data, and for the analysis of biological data it is beneficial to use all three tests simultaneously instead of focusing exclusively on just one.
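
Of the three statistics named above, the sum of squared t-statistics is the simplest to write down. The sketch below is illustrative only: the abstract does not say which form of the t-statistic was used or how significance was assessed, so Welch's t and a sample-label permutation null are assumptions, and the function names are hypothetical. Hotelling's T² and the N-statistic are not shown.

```python
import random
from statistics import mean, variance

def welch_t(x, y):
    """Welch's two-sample t-statistic (unequal variances)."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se

def sum_squared_t(group1, group2):
    """Sum over genes of the squared per-gene t-statistic.

    group1, group2: lists of samples; each sample is a list of
    expression values for the genes in the set, in the same order.
    """
    genes1, genes2 = zip(*group1), zip(*group2)
    return sum(welch_t(a, b) ** 2 for a, b in zip(genes1, genes2))

def permutation_pvalue(group1, group2, n_perm=500, seed=0):
    """Null distribution obtained by shuffling sample labels."""
    rng = random.Random(seed)
    observed = sum_squared_t(group1, group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if sum_squared_t(pooled[:n1], pooled[n1:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because each gene contributes independently to the sum, this statistic ignores intergene correlation, whereas a multivariate statistic such as Hotelling's T² uses the covariance matrix; this is one reason the tests can flag different gene sets.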

Relevance:

100.00%

Abstract:

Chronic myelomonocytic leukaemia (CMML) is a heterogeneous haematopoietic disorder characterized by myeloproliferative or myelodysplastic features. At present, the pathogenesis of this malignancy is not completely understood. In this study, we sought to analyse gene expression profiles of CMML in order to characterize new molecular outcome predictors. A learning set of 32 untreated CMML patients at diagnosis was available for TaqMan low-density array gene expression analysis. From 93 selected genes related to cancer and cell cycle, we built a five-gene prognostic index after multiplicity correction. Using this index, we characterized two categories of patients with distinct overall survival (94% vs. 19% for good and poor overall survival, respectively; P = 0.007) and we successfully validated its strength on an independent cohort of 21 CMML patients with Affymetrix gene expression data. We found no specific patterns of association with traditional prognostic stratification parameters in the learning cohort. However, the poor survival group strongly correlated with high-risk treated patients and transformation to acute myeloid leukaemia. We report here a new multigene prognostic index for CMML, independent of the gene expression measurement method, which could be used as a powerful tool to predict clinical outcome and help physicians to evaluate criteria for treatments.

Relevance:

100.00%

Abstract:

Urothelial cancer (UC) is highly recurrent and can progress from non-invasive (NMIUC) to a more aggressive muscle-invasive (MIUC) subtype that invades the muscle tissue layer of the bladder. We present a proof-of-principle study showing that network-based features of gene pairs can be used to improve classifier performance and the functional analysis of urothelial cancer gene expression data. In the first step of our procedure, each individual sample of a UC gene expression dataset is inflated by gene pair expression ratios that are defined based on a given network structure. In the second step, an elastic net feature selection procedure for network-based signatures is applied to discriminate between NMIUC and MIUC samples. We performed a repeated random subsampling cross-validation in three independent datasets. The network signatures were characterized by a functional enrichment analysis and studied for the enrichment of known cancer genes. We observed that the network-based gene signatures from meta collections of protein-protein interaction (PPI) databases such as CPDB and the PPI databases HPRD and BioGrid improved the classification performance compared to single-gene-based signatures. The network-based signatures that were derived from PPI databases showed a prominent enrichment of cancer genes (e.g., TP53, TRIM27 and HNRNPA2B1). We provide a novel integrative approach for large-scale gene expression analysis for the identification and development of novel diagnostic targets in bladder cancer. Further, our method made it possible to link cancer gene associations to network-based expression signatures that are not observed in gene-based expression signatures.
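
The first step, inflating a sample with network-defined gene pair ratios, might look roughly like this. The sketch assumes positive expression values and uses a log2 ratio; the abstract only says "expression ratios", so the exact form, the feature naming scheme and the function name are assumptions, and the edge used in the example is a toy input rather than one taken from the study. The elastic net step is not shown.

```python
import math

def inflate_with_pair_ratios(sample, edges):
    """Add one log2-ratio feature per network edge (hypothetical helper).

    sample: dict mapping gene -> positive expression value.
    edges:  iterable of (gene_a, gene_b) pairs from an interaction network.
    Edges with a gene missing from the sample are skipped.
    """
    features = dict(sample)  # keep the original single-gene features
    for a, b in edges:
        if a in sample and b in sample:
            features[f"{a}/{b}"] = math.log2(sample[a] / sample[b])
    return features

# Toy sample and toy network; the second edge is skipped because
# "MISSING" was not measured.
sample = {"TP53": 8.0, "MDM2": 2.0, "GAPDH": 1.0}
edges = [("TP53", "MDM2"), ("TP53", "MISSING")]
inflated = inflate_with_pair_ratios(sample, edges)
```

The inflated feature dictionaries, one per sample, would then be passed to whatever sparse classifier is used for feature selection.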

Relevance:

100.00%

Abstract:

Purpose: Our purpose in this report was to define genes and pathways dysregulated as a consequence of the t(4;14) in myeloma, and to gain insight into the downstream functional effects that may explain the different prognosis of this subgroup.

Experimental Design: Fibroblast growth factor receptor 3 (FGFR3) overexpression, the presence of immunoglobulin heavy chain-multiple myeloma SET domain (IgH-MMSET) fusion products and the identification of t(4;14) breakpoints were determined in a series of myeloma cases. Differentially expressed genes were identified between cases with (n = 55) and without (n = 24) a t(4;14) by using global gene expression analysis.

Results: Cases with a t(4;14) have a distinct expression pattern compared with other cases of myeloma. A total of 127 genes were identified as being differentially expressed including MMSET and cyclin D2, which have been previously reported as being associated with this translocation. Other important functional classes of genes include cell signaling, apoptosis and related genes, oncogenes, chromatin structure, and DNA repair genes. Interestingly, 25% of myeloma cases lacking evidence of this translocation had up-regulation of the MMSET transcript to the same level as cases with a translocation.

Conclusions: t(4;14) cases form a distinct subgroup of myeloma cases with a unique gene signature that may account for their poor prognosis. A number of non-t(4;14) cases also express MMSET consistent with this gene playing a role in myeloma pathogenesis.

Relevance:

90.00%

Abstract:

BACKGROUND & AIMS:
Gastric cancer (GC) is a heterogeneous disease comprising multiple subtypes that have distinct biological properties and effects in patients. We sought to identify new, intrinsic subtypes of GC by gene expression analysis of a large panel of GC cell lines. We tested if these subtypes might be associated with differences in patient survival times and responses to various standard-of-care cytotoxic drugs.
METHODS:
We analyzed gene expression profiles for 37 GC cell lines to identify intrinsic GC subtypes. These subtypes were validated in primary tumors from 521 patients in 4 independent cohorts, where the subtypes were determined by either expression profiling or subtype-specific immunohistochemical markers (LGALS4, CDH17). In vitro sensitivity to 3 chemotherapy drugs (5-fluorouracil, cisplatin, oxaliplatin) was also assessed.
RESULTS:
Unsupervised cell line analysis identified 2 major intrinsic genomic subtypes (G-INT and G-DIF) that had distinct patterns of gene expression. The intrinsic subtypes, but not subtypes based on Lauren's histopathologic classification, were prognostic of survival, based on univariate and multivariate analysis in multiple patient cohorts. The G-INT cell lines were significantly more sensitive to 5-fluorouracil and oxaliplatin, but more resistant to cisplatin, than the G-DIF cell lines. In patients, intrinsic subtypes were associated with survival time following adjuvant, 5-fluorouracil-based therapy.
CONCLUSIONS:
Intrinsic subtypes of GC, based on distinct patterns of expression, are associated with patient survival and response to chemotherapy. Classification of GC based on intrinsic subtypes might be used to determine prognosis and customize therapy.

Relevance:

90.00%

Abstract:

The host genotype has been proposed to contribute to individually composed bacterial communities in the gut. To provide deeper insight into interactions between gut bacteria and host, we associated germ-free C3H and C57BL/10 mice with intestinal bacteria from a C57BL/10 donor mouse. Analysis of microbiota similarity between the animals with denaturing gradient gel electrophoresis revealed the development of a mouse strain-specific microbiota. Microarray-based gene expression analysis in the colonic mucosa identified 202 genes whose expression differed significantly by a factor of more than 2. Application of bioinformatics tools demonstrated that functional terms including signaling/secretion, lipid degradation/catabolism, guanine nucleotide/guanylate binding and immune response were significantly enriched in differentially expressed genes. We took a closer look at the 56 genes with expression differences of more than 4-fold and observed higher expression in C57BL/10 mice of the genes coding for Tlr1 and Ang4, which are involved in the recognition of and response to gut bacteria. A higher expression of Pla2g2a was detected in C3H mice. In addition, a number of interferon-inducible genes were more highly expressed in C3H than in C57BL/10 mice, including Gbp1, Mal, Oasl2, Ifi202b, Rtp4, Ly6g6c, Ifi27l2a, Usp18, Ifit1, Ifi44, and Ly6g, indicating that interferons may play an essential role in microbiota regulation. However, genes coding for interferons, their receptors, factors involved in interferon expression regulation or signaling pathways were not differentially expressed between the two mouse strains. Taken together, our study confirms that the host genotype is involved in the establishment of host-specific bacterial communities in the gut. Based on expression differences after colonization with the same bacterial inoculum, we propose that Pla2g2a and interferon-dependent genes may contribute to this phenomenon.

Relevance:

90.00%

Abstract:

Background: Modern cancer research often involves large datasets and the use of sophisticated statistical techniques. Together these add a heavy computational load to the analysis, which is often coupled with issues surrounding data accessibility. Connectivity mapping is an advanced bioinformatic and computational technique dedicated to therapeutics discovery and drug re-purposing around differential gene expression analysis. On a normal desktop PC, it is common for the connectivity mapping task with a single gene signature to take >2h to complete using sscMap, a popular Java application that runs on standard CPUs (Central Processing Units). Here, we describe new software, cudaMap, which has been implemented using CUDA C/C++ to harness the computational power of NVIDIA GPUs (Graphics Processing Units) to greatly reduce processing times for connectivity mapping.

Results: cudaMap can identify candidate therapeutics from the same signature in just over thirty seconds when using an NVIDIA Tesla C2050 GPU. Results from the analysis of multiple gene signatures, which would previously have taken several days, can now be obtained in as little as 10 minutes, greatly facilitating candidate therapeutics discovery with high throughput. We are able to demonstrate dramatic speed differentials between GPU assisted performance and CPU executions as the computational load increases for high accuracy evaluation of statistical significance.

Conclusion: Emerging 'omics' technologies are constantly increasing the volume of data and information to be processed in all areas of biomedical research. Embracing the multicore functionality of GPUs represents a major avenue of local accelerated computing. cudaMap will make a strong contribution to the discovery of candidate therapeutics by enabling speedy execution of heavy duty connectivity mapping tasks, which are increasingly required in modern cancer research. cudaMap is open source and can be freely downloaded from http://purl.oclc.org/NET/cudaMap.
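
For readers unfamiliar with connectivity mapping, the core scoring step can be sketched as a signed-rank sum. This is a simplified illustration written in Python for readability; the abstract does not state the exact formula used by sscMap or cudaMap, so the scoring rule, the normalization and the function name below are assumptions.

```python
def connection_score(reference_ranks, signature):
    """Signed-rank connection score in [-1, 1] (hypothetical helper).

    reference_ranks: dict gene -> signed rank in a drug reference profile
                     (large positive = strongly up, large negative = down).
    signature:       non-empty dict gene -> +1 (up) or -1 (down) in the
                     query gene signature.
    """
    # Genes absent from the reference profile contribute nothing.
    raw = sum(reference_ranks.get(g, 0) * s for g, s in signature.items())
    # Normalize by the best score a signature of this size could reach.
    top = sorted((abs(r) for r in reference_ranks.values()), reverse=True)
    return raw / sum(top[:len(signature)])

reference = {"A": 3, "B": -2, "C": 1}
same_direction = connection_score(reference, {"A": 1, "B": -1})
opposite_direction = connection_score(reference, {"A": -1, "B": 1})
```

Assessing significance usually means rescoring many random signatures against every reference profile. Each of those scores is independent of the others, which is the kind of workload that maps naturally onto GPU threads.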

Relevance:

90.00%

Abstract:

Identifying rare, highly penetrant risk mutations may be an important step in dissecting the molecular etiology of schizophrenia. We conducted a gene-based analysis of large (>100kb), rare copy number variants (CNVs) in the Wellcome Trust Case Control Consortium 2 (WTCCC2) schizophrenia sample of 1,564 cases and 1,748 controls all from Ireland, and further extended the analysis to include an additional 5,196 UK controls. We found association with duplications at chr20p12.2 (P=0.007) and evidence of replication in large independent European schizophrenia (P=0.052) and UK bipolar disorder case-control cohorts (P=0.047). A combined analysis of Irish/UK subjects including additional psychosis cases (schizophrenia and bipolar disorder) identified 22 carriers in 11,707 cases and 10 carriers in 21,204 controls (meta-analysis CMH P value=2×10⁻⁴ (odds ratio (OR)=11.3, 95% CI=3.7, ∞)). Nineteen of the 22 cases and 8 of the 10 controls carried duplications starting at 9.68Mb with similar breakpoints across samples. By haplotype analysis and sequencing we identified a tandem ∼149kb duplication overlapping the gene p21 Protein-Activated Kinase 7 (PAK7, also called PAK5) which was in linkage disequilibrium with local haplotypes (P=2.5×10⁻²¹), indicative of a single ancestral duplication event. We confirmed the breakpoints in 8/8 carriers tested and found co-segregation of the duplication with illness in two additional family members of one of the affected probands. We demonstrate that PAK7 is developmentally co-expressed with another known psychosis risk gene (DISC1) suggesting a potential molecular mechanism involving aberrant synapse development and plasticity.

Relevance:

90.00%

Abstract:

BACKGROUND: Despite the significant progress made in colon cancer chemotherapy, advanced disease remains largely incurable and novel efficacious chemotherapies are urgently needed. Histone deacetylase inhibitors (HDACi) represent a novel class of agents which have demonstrated promising preclinical activity and are undergoing clinical evaluation in colon cancer. The goal of this study was to identify genes in colon cancer cells that are differentially regulated by two clinically advanced hydroxamic acid HDACi, vorinostat and LBH589, to provide a rationale for novel drug combination partners and to identify a core set of HDACi-regulated genes.

METHODS: HCT116 and HT29 colon cancer cells were treated with LBH589 or vorinostat and growth inhibition, acetylation status and apoptosis were analyzed in response to treatment using MTS, Western blotting and flow cytometric analyses. In addition, gene expression was analyzed using the Illumina Human-6 V2 BeadChip array and Ingenuity Pathway Analysis.

RESULTS: Treatment with either vorinostat or LBH589 rapidly induced histone acetylation and cell cycle arrest, and inhibited the growth of both HCT116 and HT29 cells. Bioinformatic analysis of the microarray profiling revealed significant similarity in the genes altered in expression following treatment with the two HDACi tested within each cell line. However, analysis of genes that were altered in expression in the HCT116 and HT29 cells revealed cell-line-specific responses to HDACi treatment. In addition, a core cassette of 11 genes modulated by both vorinostat and LBH589 was identified in both colon cancer cell lines analyzed.

CONCLUSION: This study identified HDACi-induced alterations in critical genes involved in nucleotide metabolism, angiogenesis, mitosis and cell survival which may represent potential intervention points for novel therapeutic combinations in colon cancer. This information will assist in the identification of novel pathways and targets that are modulated by HDACi, providing much-needed information on HDACi mechanism of action and providing rationale for novel drug combination partners. We identified a core signature of 11 genes which were modulated by both vorinostat and LBH589 in a similar manner in both cell lines. These core genes will assist in the development and validation of a common gene set which may represent a molecular signature of HDAC inhibition in colon cancer.