2 results for The cancer genome atlas
in Cambridge University Engineering Department Publications Database
Identifying cancer subtypes in glioblastoma by combining genomic, transcriptomic and epigenomic data
Abstract:
We present a nonparametric Bayesian method for disease subtype discovery in multi-dimensional cancer data. Our method can simultaneously analyse a wide range of data types, allowing for both agreement and disagreement between their underlying clustering structure. It includes feature selection and infers the most likely number of disease subtypes, given the data. We apply the method to 277 glioblastoma samples from The Cancer Genome Atlas, for which there are gene expression, copy number variation, methylation and microRNA data. We identify 8 distinct consensus subtypes and study their prognostic value for death, new tumour events, progression and recurrence. The consensus subtypes are prognostic of tumour recurrence (log-rank p-value of $3.6 \times 10^{-4}$ after correction for multiple hypothesis tests). This is driven principally by the methylation data (log-rank p-value of $2.0 \times 10^{-3}$) but the effect is strengthened by the other 3 data types, demonstrating the value of integrating multiple data types. Of particular note is a subtype of 47 patients characterised by very low levels of methylation. This subtype has very low rates of tumour recurrence and no new events in 10 years of follow-up. We also identify a small gene expression subtype of 6 patients that shows particularly poor survival outcomes. Additionally, we note a consensus subtype that shows a highly distinctive data signature and suggest that it is therefore a biologically distinct subtype of glioblastoma. The code is available from https://sites.google.com/site/multipledatafusion/
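For readers unfamiliar with the statistic quoted in this abstract, the following is a minimal sketch of a log-rank test followed by a multiple-testing correction. It is not the authors' code. It assumes a simplified two-group comparison (the paper compares 8 subtypes) and a Bonferroni correction (the abstract does not say which correction was used). The function names `logrank_test` and `bonferroni` are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-squared statistic, p-value).

    times_*  : follow-up time for each patient
    events_* : 1 if the event (e.g. recurrence) was observed, 0 if censored
    Assumption: two groups only; the abstract's analysis spans 8 subtypes.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b

        observed_a += d_a
        expected_a += d * n_a / n  # expected events under the null
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    stat = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def bonferroni(p_values):
    """Bonferroni correction (an assumed choice, not stated in the abstract)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

In this sketch, each outcome tested (death, new tumour events, progression, recurrence) would yield one raw p-value, and the list of raw values would then be passed to `bonferroni`.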
Abstract:
The Arabidopsis genome contains a highly complex and abundant population of small RNAs, and many of the endogenous siRNAs are dependent on RNA-Dependent RNA Polymerase 2 (RDR2) for their biogenesis. By analyzing an rdr2 loss-of-function mutant using two different parallel sequencing technologies, MPSS and 454, we characterized the complement of miRNAs expressed in Arabidopsis inflorescence to considerable depth. Nearly all known miRNAs were enriched in this mutant and we identified 13 new miRNAs, all of which were of relatively low abundance and constitute new families. Trans-acting siRNAs (ta-siRNAs) were even more highly enriched. Computational and gel blot analyses suggested that the minimal number of miRNAs in Arabidopsis is approximately 155. The size profile of small RNAs in rdr2 reflected enrichment of 21-nt miRNAs and other classes of siRNAs like ta-siRNAs, and a significant reduction in 24-nt heterochromatic siRNAs. Other classes of small RNAs were found to be RDR2-independent, particularly those derived from long inverted repeats and a subset of tandem repeats. The small RNA populations in other Arabidopsis small RNA biogenesis mutants were also examined; a dcl2/3/4 triple mutant showed a similar pattern to rdr2, whereas dcl1-7 and rdr6 showed reductions in miRNAs and ta-siRNAs consistent with their activities in the biogenesis of these types of small RNAs. Deep sequencing of mutants provides a genetic approach for the dissection and characterization of diverse small RNA populations and the identification of low abundance miRNAs.