905 results for RNA sequencing


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Most fishes produce free-living embryos that are exposed to environmental stressors, including pathogenic microorganisms, immediately following fertilization. Initial immune protection of embryos involves the chorion, as a protective barrier, and maternally allocated antimicrobial compounds. At later developmental stages, host-genetic effects influence susceptibility and tolerance, suggesting a direct interaction between embryo genes and pathogens. So far, only a few host genes have been identified that correlate with embryonic survival under pathogen stress in salmonids. Here, we used high-throughput RNA sequencing to describe the transcriptional response of a non-model fish, the Alpine whitefish Coregonus palaea, to infection, both in terms of host genes that are likely manipulated by the pathogen and those involved in an early putative immune response. Embryos were produced in vitro, raised individually, and exposed at the late-eyed stage to a virulent strain of the opportunistic fish pathogen Pseudomonas fluorescens. The pseudomonad increased embryonic mortality and affected gene expression substantially. For example, essential upregulated metabolic pathways in embryos under pathogen stress included ion binding pathways, aminoacyl-tRNA biosynthesis, and the production of arginine and proline, most probably mediated by the pathogen for its proliferation. The most prominently downregulated transcripts comprised the biosynthesis of unsaturated fatty acids, the citrate cycle, and various isoforms of B-cell transcription factors. These factors have been shown to play a significant role in host blood cell differentiation and renewal. With regard to specific immune functions, differentially expressed transcripts mapped to the complement cascade, MHC class I and II, TNF-alpha, and T-cell differentiation proteins. The results of this study reveal insights into how P. fluorescens impairs the development of whitefish embryos and set a foundation for future studies investigating host-pathogen interactions in fish embryos.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With the advent of high-throughput sequencing (HTS), the emerging science of metagenomics is transforming our understanding of the relationships of microbial communities with their environments. While metagenomics aims to catalogue the genes present in a sample, metatranscriptomics, by assessing which genes are actively expressed, can provide a mechanistic understanding of community inter-relationships. To achieve these goals, several challenges need to be addressed, from sample preparation to sequence processing, statistical analysis and functional annotation. Here we use an inbred non-obese diabetic (NOD) mouse model, in which germ-free animals were colonized with a defined mixture of eight commensal bacteria, to explore methods of RNA extraction and to develop a pipeline for the generation and analysis of metatranscriptomic data. Applying the Illumina HTS platform, we sequenced 12 NOD cecal samples prepared using multiple RNA-extraction protocols. The absence of a complete set of reference genomes necessitated a peptide-based search strategy. Up to 16% of sequence reads could be matched to a known bacterial gene. Phylogenetic analysis of the mapped ORFs revealed a distribution consistent with ribosomal RNA, the majority from Bacteroides or Clostridium species. To place these HTS data within a systems context, we mapped the relative abundance of corresponding Escherichia coli homologs onto metabolic and protein-protein interaction networks. These maps identified bacterial processes with components that were well represented in the datasets. In summary, this study highlights the potential of exploiting the economy of HTS platforms for metatranscriptomics.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status. Section 1: The cost of DNA sequencing has fallen dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However, it remains unclear how sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves with increasing sequencing depth. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model performs better than the binomial or the pure beta-binomial model, with a lower false discovery rate. Section 2: Although a number of methods have been proposed to accurately analyze differential RNA expression at the gene level, modeling at the base pair level is required. Here, we find that the overdispersion rate decreases as the sequencing depth increases at the base pair level. We also propose four models and compare them with each other. As expected, our beta-binomial model with a dynamic overdispersion rate is shown to be superior. Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe a previously undiscovered bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. We also find that this influence is related to the local sequence of the random hexamer that is used in priming. We suggest a model of the inequality between samples to correct this type of bias. Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume the thresholds are specific for each gene. It is also intriguing to investigate genes that are largely controlled by DNA methylation. These genes are called “L-shaped” genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of BRCA. In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. We also reveal a new bias in RNA-seq and provide a detailed understanding of the relationship between this new bias and the local sequence. Finally, we develop a powerful method to dichotomize methylation status and consequently identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
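
The depth-dependent overdispersion idea in Sections 1 and 2 of the abstract above can be illustrated with a minimal sketch. The abstract does not give the model's functional form, so everything here is an assumption: the mean/overdispersion parameterization (`mu`, `rho`), the hypothetical helper `depth_dependent_rho`, and its default values `rho0` and `gamma` are illustrative only and are not the dissertation's actual model.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def beta_binomial_logpmf(k, n, mu, rho):
    """Log-probability of k successes in n trials under a beta-binomial.

    mu  : mean proportion, 0 < mu < 1
    rho : overdispersion, 0 < rho < 1 (rho -> 0 approaches the binomial)
    """
    s = (1.0 - rho) / rho          # alpha + beta
    a, b = mu * s, (1.0 - mu) * s  # Beta shape parameters
    return log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b)


def depth_dependent_rho(depth_millions, rho0=0.2, gamma=0.5):
    """HYPOTHETICAL link: overdispersion shrinks as sequencing depth grows.

    The power-law form and the defaults are assumptions for illustration.
    """
    return rho0 / (1.0 + depth_millions) ** gamma


# Example: the same observation scored at two sequencing depths.
for depth in (1, 10):
    rho = depth_dependent_rho(depth)
    print(depth, rho, beta_binomial_logpmf(12, 50, 0.3, rho))
```

With a fixed `rho`, the function reduces to the "pure" beta-binomial that the abstract uses as a baseline; the depth link is the only part that changes between the two.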

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

70.00% 70.00%

Publisher:

Abstract:

A precise, reproducible deletion made during in vitro reverse transcription of RNA2 from the icosahedral positive-stranded Helicoverpa armigera stunt virus (Tetraviridae) is described. The deletion, located between two hexamer repeats, is a 50-base sequence that includes one copy of the hexamer repeat. Only the Moloney murine leukemia virus reverse transcriptase and its derivative Superscript I, carrying a deletion of the carboxy-terminal RNase H region, showed this response, indicating a template-switching mechanism different from a previously proposed one that involves an RNase H-dependent strand transfer. Superscript II, however, which carries point mutations to reduce RNase H activity, does not cause a deletion. A possible mechanism involves the enzyme pausing at the 3' side of a stem-loop structure and the 3' end of the nascent DNA strand separating from the template and reannealing to the upstream hexamer repeat.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Candida albicans and Candida dubliniensis are pathogenic fungi that are highly related but differ in virulence and in some phenotypic traits. During in vitro growth on certain nutrient-poor media, C. albicans and C. dubliniensis are the only yeast species that are able to produce chlamydospores, large thick-walled cells of unknown function. Interestingly, only C. dubliniensis forms pseudohyphae with abundant chlamydospores when grown on Staib medium, while C. albicans grows exclusively as a budding yeast. To further our understanding of chlamydospore development and assembly, we compared the global transcriptional profiles of both species during growth in liquid Staib medium by RNA sequencing. We also included in our study a C. albicans mutant that lacks the morphogenetic transcriptional repressor Nrg1. This strain, which is characterized by its constitutive pseudohyphal growth, specifically produces masses of chlamydospores in Staib medium, similar to C. dubliniensis. This comparative approach identified a set of putatively chlamydospore-related genes. Two of the homologous C. albicans and C. dubliniensis genes (CSP1 and CSP2) that were most strongly upregulated during chlamydospore development were analysed in more detail. Using the green fluorescent protein as a reporter, the encoded putative cell wall-related proteins were found to localize exclusively to C. albicans and C. dubliniensis chlamydospores. Our findings uncover the first chlamydospore-specific markers in Candida species and provide novel insights into the complex morphogenetic development of these important fungal pathogens.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Within the ENCODE Consortium, GENCODE aimed to accurately annotate all protein-coding genes, pseudogenes, and noncoding transcribed loci in the human genome through manual curation and computational methods. Annotated transcript structures were assessed, and less well-supported loci were systematically validated experimentally. Predicted exon-exon junctions were evaluated by RT-PCR amplification followed by highly multiplexed sequencing readout, a method we called RT-PCR-seq. Seventy-nine percent of all assessed junctions were confirmed by this evaluation procedure, demonstrating the high quality of the GENCODE gene set. RT-PCR-seq was also efficient at screening gene models predicted using the Human Body Map (HBM) RNA-seq data. We validated 73% of these predictions, thus confirming 1168 novel genes, mostly noncoding, which will further complement the GENCODE annotation. Our novel experimental validation pipeline is extremely sensitive, far more so than unbiased transcriptome profiling through RNA sequencing, which is becoming the norm. For example, exon-exon junctions unique to GENCODE annotated transcripts are five times more likely to be corroborated with our targeted approach than with extensive large human transcriptome profiling. Data sets such as the HBM and ENCODE RNA-seq data fail to sample low-expressed transcripts. Our RT-PCR-seq targeted approach also has the advantage of identifying novel exons of known genes, as we discovered unannotated exons in ~11% of assessed introns. We thus estimate that at least 18% of known loci have yet-unannotated exons. Our work demonstrates that cataloging all of the genic elements encoded in the human genome will necessitate a coordinated effort between unbiased and targeted approaches, such as RNA-seq and RT-PCR-seq.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Background: MicroRNAs (miRNAs) are short non-coding regulatory RNAs that control gene expression, usually producing translational repression and gene silencing. High-throughput sequencing technologies have revealed heterogeneity at the length and sequence level for the majority of mature miRNAs (isomiRs). Most isomiRs can be explained by variability in either Dicer1 or Drosha cleavage during miRNA biogenesis at the 5′ or 3′ end of the miRNA (trimming variants). Although isomiRs have been described in different tissues and organisms, their functional validation as modulators of gene expression remains elusive. Here we have characterized the expression and function of a highly abundant miR-101 5′-trimming variant (5′-isomiR-101). Results: The analysis of small RNA sequencing data in several human tissues and cell lines indicates that 5′-isomiR-101 is ubiquitously detected and highly abundant, especially in the brain. 5′-isomiR-101 was found in Ago-2 immunocomplexes, and complementary approaches showed that 5′-isomiR-101 interacted with different members of the silencing (RISC) complex. In addition, 5′-isomiR-101 decreased the expression of five validated miR-101 targets, suggesting that it is a functional variant. Both the binding to RISC members and the degree of silencing were less efficient for 5′-isomiR-101 compared with miR-101. For some targets, both miR-101 and 5′-isomiR-101 significantly decreased protein expression with no changes in the respective mRNA levels. Although a high number of overlapping predicted targets suggests similar targeted biological pathways, a correlation analysis of the expression profiles of miR-101 variants and predicted mRNA targets in human brains at different ages suggests specific functions for miR-101 and 5′-isomiR-101.
Conclusions: These results suggest that isomiRs are functional variants and further indicate that, for a given miRNA, the different isomiRs may contribute to the overall effect as quantitative and qualitative fine-tuners of gene expression.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

UNLABELLED: In vivo transcriptional analyses of microbial pathogens are often hampered by low proportions of pathogen biomass in host organs, hindering coverage of the full pathogen transcriptome. We aimed to address the transcriptome profiles of Candida albicans, the most prevalent fungal pathogen in systemically infected immunocompromised patients, during systemic infection in different hosts. We developed a strategy for high-resolution quantitative analysis of the C. albicans transcriptome directly from early and late stages of systemic infection in two different host models, the mouse and the insect Galleria mellonella. Our results show that transcriptome sequencing (RNA-seq) libraries were enriched for fungal transcripts up to 1,600-fold using biotinylated bait probes to capture C. albicans sequences. This enrichment biased the read counts of only ~3% of the genes, which can be identified and removed based on a priori criteria. This allowed an unprecedented resolution of the C. albicans transcriptome in vivo, with detection of over 86% of its genes. The transcriptional response of the fungus was surprisingly similar during infection of the two hosts and at the two time points, although some host- and time point-specific genes could be identified. Genes that were highly induced during infection were involved, for instance, in stress response, adhesion, iron acquisition, and biofilm formation. Of the in vivo-regulated genes, 10% are still of unknown function, and their future study will be of great interest. The fungal RNA enrichment procedure used here will help to better characterize the C. albicans response in infected hosts and may be applied to other microbial pathogens. IMPORTANCE: Understanding the mechanisms utilized by pathogens to infect and cause disease in their hosts is crucial for rational drug development. Transcriptomic studies may help investigations of these mechanisms by determining which genes are expressed specifically during infection.
This task has been difficult so far, since the proportion of microbial biomass in infected tissues is often extremely low, thus limiting the depth of sequencing and comprehensive transcriptome analysis. Here, we adapted a technology to capture and enrich C. albicans RNA, which was next used for deep RNA sequencing directly from infected tissues from two different host organisms. The high-resolution transcriptome revealed a large number of genes that were so far unknown to participate in infection, which will likely constitute a focus of study in the future. More importantly, this method may be adapted to perform transcript profiling of any other microbes during host infection or colonization.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer death among females worldwide. It is considered a highly heterogeneous disease and must be classified into more homogeneous groups. Hence, the purpose of this study was to classify breast tumors based on variations in gene expression patterns derived from RNA sequencing by using different class discovery methods. Forty-two paired breast tumor samples were sequenced on the Illumina Genome Analyzer, and the data were processed with TopHat2 and htseq-count. As reported previously, breast cancer can be grouped into five main groups, each with a distinctive expression profile: the basal epithelial-like group, the HER2 group, the normal breast-like group and two Luminal groups. When breast tumor samples were classified using the PAM50 method, the most common subtype was Luminal B, which was significantly associated with high expression of ESR1 and ERBB2. The Luminal A subtype had significantly high expression of ESR1 and SLC39A6, whereas the HER2 subtype had high expression of the ERBB2 and CCNE1 genes and low luminal epithelial gene expression. Basal-like and normal-like subtypes were associated with low expression of ESR1, PgR and HER2, and had significantly high expression of cytokeratins 5 and 17. Our results were similar to the TCGA breast cancer data and to known studies of breast cancer classification. Classifying breast tumors could add significant prognostic and predictive information to standard parameters and, moreover, identify marker genes for each subtype to find a better therapy for patients with breast cancer.
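
Centroid-based subtype assignment of the kind used by PAM50-style classifiers can be sketched as follows: a sample is assigned to the subtype whose reference expression profile it correlates with best (Spearman rank correlation here). This is a toy illustration, not the study's pipeline. The four-gene panel (ESR1, ERBB2, KRT5, KRT17), the centroid values, and the function names are all hypothetical and chosen only to echo the marker genes named in the abstract above.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def classify(sample, centroids):
    """Return the name of the centroid most correlated with the sample."""
    return max(centroids, key=lambda name: spearman(sample, centroids[name]))


# HYPOTHETICAL centroids; gene order: ESR1, ERBB2, KRT5, KRT17.
TOY_CENTROIDS = {
    "luminal-like": [2.0, 0.1, -1.0, -1.2],
    "HER2-like": [-0.5, 2.5, -0.8, -1.0],
    "basal-like": [-1.5, -0.6, 1.8, 2.0],
}

sample = [-1.0, -0.2, 1.1, 1.6]  # low ESR1/ERBB2, high KRT5/KRT17
print(classify(sample, TOY_CENTROIDS))
```

A real classifier would use the full published gene panel and centroids after appropriate normalization of the htseq-count output; the rank-based correlation is what makes the assignment robust to monotone scale differences between platforms.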

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Schistosoma mansoni is one of the agents of schistosomiasis, a chronic and debilitating disease. Here, we present a transcriptome-wide characterization of adult S. mansoni males by high-throughput RNA sequencing. We obtained 1,620,432 high-quality ESTs from a directional strand-specific cDNA library, resulting in a 26% higher coverage of genome bases than that of the public ESTs available at NCBI. With a 15×-deep coverage of transcribed genomic regions, our data were able to (i) confirm for the first time 990 predictions without previous evidence of transcription; (ii) correct gene predictions; and (iii) discover 989 and 1196 RNA-seq contigs that map to intergenic and intronic genomic regions, respectively, where no gene had been predicted before. These contigs could represent new protein-coding genes or non-coding RNAs (ncRNAs). Interestingly, we identified 11 novel micro-exon genes (MEGs). These data reveal new features of the S. mansoni transcriptional landscape and significantly advance our understanding of the parasite transcriptome. (c) 2011 Elsevier Inc. All rights reserved.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade’s worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement, in part due to claims of reduced variability in comparison to microarrays. However, we show that RNA-seq data demonstrate unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find that GC-content has a strong sample-specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here we describe statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization (CQN) algorithm combines robust generalized regression, to remove systematic bias introduced by deterministic features such as GC-content, with quantile normalization, to correct for global distortions.
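
The quantile normalization component mentioned in the abstract above can be illustrated with a minimal sketch. This shows only plain quantile normalization across samples. It is not the CQN algorithm, which additionally fits a regression on features such as GC-content before this step. The function name and the toy data are assumptions for illustration, and ties are broken by position rather than averaged.

```python
def quantile_normalize(samples):
    """Force every sample to share the same empirical distribution.

    samples: list of samples, each a list of per-gene values of equal length.
    Returns a new list of samples in which the value at rank r of each sample
    is replaced by the mean, across samples, of the values at rank r.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the r-th smallest value across samples.
    reference = [
        sum(s[r] for s in sorted_samples) / len(samples) for r in range(n_genes)
    ]
    normalized = []
    for sample in samples:
        # Gene indices ordered from lowest to highest value in this sample.
        order = sorted(range(n_genes), key=lambda i: sample[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = reference[rank]
        normalized.append(out)
    return normalized


toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]  # two samples, three genes
print(quantile_normalize(toy))
# → [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization every sample has identical sorted values, so remaining differences between samples reflect only changes in the relative ranking of genes.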