4 results for Sequencing data
Abstract:
The splicing factor SF3B1 is the most frequently mutated gene in myelodysplastic syndromes (MDS), and is strongly associated with the presence of ring sideroblasts (RS). We have performed a systematic analysis of cryptic splicing abnormalities from RNA sequencing data on hematopoietic stem cells (HSCs) of SF3B1-mutant MDS cases with RS. Aberrant splicing events in many downstream target genes were identified and cryptic 3' splice site usage was a frequent event in SF3B1-mutant MDS. The iron transporter ABCB7 is a well-recognized candidate gene showing marked downregulation in MDS with RS. Our analysis unveiled aberrant ABCB7 splicing, due to usage of an alternative 3' splice site in MDS patient samples, giving rise to a premature termination codon in the ABCB7 mRNA. Treatment of cultured SF3B1-mutant MDS erythroblasts and a CRISPR/Cas9-generated SF3B1-mutant cell line with the nonsense-mediated decay (NMD) inhibitor cycloheximide showed that the aberrantly spliced ABCB7 transcript is targeted by NMD. We describe cryptic splicing events in the HSCs of SF3B1-mutant MDS, and our data support a model in which NMD-induced downregulation of the iron exporter ABCB7 mRNA transcript resulting from aberrant splicing caused by mutant SF3B1 underlies the increased mitochondrial iron accumulation found in MDS patients with RS. Leukemia advance online publication, 17 June 2016; doi:10.1038/leu.2016.149.
Abstract:
Chromatin immunoprecipitation (ChIP) provides a means of enriching DNA associated with transcription factors, histone modifications, and indeed any other proteins for which suitably characterized antibodies are available. Over the years, sequence detection has progressed from quantitative real-time PCR and Southern blotting to microarrays (ChIP-chip) and now high-throughput sequencing (ChIP-seq). This progression has vastly increased the sequence coverage and data volumes generated. This in turn has enabled informaticians to predict the identity of multi-protein complexes on DNA based on the overrepresentation of sequence motifs in DNA enriched by ChIP with a single antibody against a single protein. In the course of the development of high-throughput sequencing, little has changed in the ChIP methodology until recently. In the last three years, a number of modifications have been made to the ChIP protocol with the goal of enhancing the sensitivity of the method and further reducing the levels of nonspecific background sequences in ChIPped samples. In this chapter, we provide a brief commentary on these methodological changes and describe a detailed ChIP-exo method able to generate narrower peaks and greater peak coverage from ChIPped material.
Abstract:
Here, we describe gene expression compositional assignment (GECA), a powerful yet simple method based on compositional statistics that can validate the transfer of prior knowledge, such as gene lists, into independent data sets, platforms and technologies. Transcriptional profiling has been used to derive gene lists that stratify patients into prognostic molecular subgroups and assess biomarker performance in the pre-clinical setting. Archived public data sets are an invaluable resource for subsequent in silico validation, though their use can lead to data integration issues. We show that GECA can be used without the need for normalising expression levels between data sets and can outperform rank-based correlation methods. To validate GECA, we demonstrate its success in the cross-platform transfer of gene lists in different domains, including bladder cancer staging, tumour site of origin and mislabelled cell lines. We also show its effectiveness in transferring an epithelial ovarian cancer prognostic gene signature across technologies, from a microarray to a next-generation sequencing setting. In a final case study, we predict the tumour site of origin and histopathology of epithelial ovarian cancer cell lines. In particular, we identify and validate the commonly used cell line OVCAR-5 as non-ovarian, being gastrointestinal in origin. GECA is available as an open-source R package.
Abstract:
DNA sequencing is now faster and cheaper than ever before, due to the development of next generation sequencing (NGS) technologies. NGS is now widely used in the research setting and is becoming increasingly utilised in clinical practice. However, due to evolving clinical commitments, increased workload and lack of training opportunities, many oncologists may be unfamiliar with the terminology and technology involved. This can lead to oncologists feeling daunted by issues such as how to interpret the vast amounts of data generated by NGS and the differences between sequencing platforms. This review article explains common concepts and terminology, summarises the process of DNA sequencing (including data analysis) and discusses the main factors to consider when deciding on a sequencing method. This article aims to improve oncologists' understanding of the most commonly used sequencing platforms and the ongoing challenges faced in expanding the use of NGS into routine clinical practice.