11 results for RNA analysis

at Duke University


Relevance:

40.00%

Publisher:

Abstract:

Cryptococcus neoformans is a pathogenic basidiomycetous yeast responsible for more than 600,000 deaths each year. It occurs as two serotypes (A and D) representing two varieties (i.e. grubii and neoformans, respectively). Here, we sequenced the genome and performed an RNA-Seq-based analysis of the C. neoformans var. grubii transcriptome structure. We determined the chromosomal locations, analyzed the sequence/structural features of the centromeres, and identified origins of replication. The genome was annotated based on automated and manual curation. More than 40,000 introns populating more than 99% of the expressed genes were identified. Although most of these introns are located in the coding DNA sequences (CDS), over 2,000 introns in the untranslated regions (UTRs) were also identified. Poly(A)-containing reads were employed to locate the polyadenylation sites of more than 80% of the genes. Examination of the sequences around these sites revealed a new poly(A)-site-associated motif (AUGHAH). In addition, 1,197 miscRNAs were identified. These miscRNAs can be spliced and/or polyadenylated, but do not appear to have obvious coding capacities. Finally, this genome sequence enabled a comparative analysis of strain H99 variants obtained after laboratory passage. The spectrum of mutations identified provides insights into the genetics underlying the micro-evolution of a laboratory strain, and identifies mutations involved in stress responses, mating efficiency, and virulence.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Over the past two decades more than fifty thousand unique clinical and biological samples have been assayed using the Affymetrix HG-U133 and HG-U95 GeneChip microarray platforms. This substantial repository has been used extensively to characterize changes in gene expression between biological samples, but has not been previously mined en masse for changes in mRNA processing. We explored the possibility of using HG-U133 microarray data to identify changes in alternative mRNA processing in several available archival datasets. RESULTS: Data from these and other gene expression microarrays can now be mined for changes in transcript isoform abundance using a program described here, SplicerAV. Using in vivo and in vitro breast cancer microarray datasets, SplicerAV was able to perform both gene-specific and isoform-specific expression profiling within the same microarray dataset. Our reanalysis of Affymetrix U133 plus 2.0 data generated by in vitro over-expression of HRAS, E2F3, beta-catenin (CTNNB1), SRC, and MYC identified several hundred oncogene-induced mRNA isoform changes, one of which revealed a previously unknown mechanism of EGFR family activation. Using clinical data, SplicerAV predicted 241 isoform changes between low and high grade breast tumors, with changes enriched among genes coding for guanyl-nucleotide exchange factors, metalloprotease inhibitors, and mRNA processing factors. Isoform changes in 15 genes were associated with aggressive cancer across the three breast cancer datasets. CONCLUSIONS: Using SplicerAV, we identified several hundred previously uncharacterized isoform changes induced by in vitro oncogene over-expression and revealed a previously unknown mechanism of EGFR activation in human mammary epithelial cells. We analyzed Affymetrix GeneChip data from over 400 human breast tumors in three independent studies, making this the largest clinical dataset analyzed for en masse changes in alternative mRNA processing. The capacity to detect RNA isoform changes in archival microarray data using SplicerAV allowed us to carry out the first analysis of isoform-specific mRNA changes directly associated with cancer survival.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Many analyses of microarray association studies involve permutation, bootstrap resampling and cross-validation, which are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. RESULTS: We have developed a CUDA-based implementation, permGPU, that employs graphics processing units in microarray association studies. We illustrate the performance and applicability of permGPU within the context of permutation resampling for a number of test statistics. An extensive simulation study demonstrates a dramatic increase in performance when using permGPU on an NVIDIA GTX 280 card compared to an optimized C/C++ solution running on a conventional Linux server. CONCLUSIONS: permGPU is available as an open-source stand-alone application and as an extension package for the R statistical environment. It provides a dramatic increase in performance for permutation resampling analysis in the context of microarray association studies. The current version offers six test statistics for carrying out permutation resampling analyses for binary, quantitative and censored time-to-event traits.
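
To make the idea of permutation resampling concrete, the following is a minimal CPU-only sketch in Python for a single gene and a binary trait. It is not permGPU's implementation or API: the function name `permutation_p_value`, its parameters, and the choice of the absolute difference in group means as the test statistic are all assumptions made for illustration. Each permutation is computed independently of the others, which is the property that makes the problem embarrassingly parallel.

```python
import random
from statistics import mean


def permutation_p_value(values, labels, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    values: expression values for one gene, one per sample.
    labels: binary trait (0 or 1), one per sample.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)

    def statistic(current_labels):
        group1 = [v for v, l in zip(values, current_labels) if l == 1]
        group0 = [v for v, l in zip(values, current_labels) if l == 0]
        return abs(mean(group1) - mean(group0))

    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling the labels breaks any real link between trait and expression.
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    expression = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0]
    trait = [0, 0, 0, 0, 1, 1, 1, 1]
    print(permutation_p_value(expression, trait, n_permutations=2000))
```

In a microarray study the same loop would be repeated for every probe set, so the total work grows with the number of genes times the number of permutations; that product is what motivates moving the computation to many cores or to a GPU.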

Relevance:

30.00%

Publisher:

Abstract:

Tumor microenvironmental stresses, such as hypoxia and lactic acidosis, play important roles in tumor progression. Although gene signatures reflecting the influence of these stresses are powerful approaches to link expression with phenotypes, they do not fully reflect the complexity of human cancers. Here, we describe the use of latent factor models to further dissect the stress gene signatures in a breast cancer expression dataset. The genes in these latent factors are coordinately expressed in tumors and depict distinct, interacting components of the biological processes. The genes in several latent factors are highly enriched in chromosomal locations. When these factors are analyzed in independent datasets with gene expression and array CGH data, the expression values of these factors are highly correlated with copy number alterations (CNAs) of the corresponding BAC clones in both the cell lines and tumors. Therefore, variation in the expression of these pathway-associated factors is at least partially caused by variation in gene dosage and CNAs among breast cancers. We have also found the expression of two latent factors without any chromosomal enrichment is highly associated with 12q CNA, likely an instance of "trans"-variations in which CNA leads to the variations in gene expression outside of the CNA region. In addition, we have found that factor 26 (1q CNA) is negatively correlated with HIF-1alpha protein and hypoxia pathways in breast tumors and cell lines. This agrees with, and for the first time links, known good prognosis associated with both a low hypoxia signature and the presence of CNA in this region. Taken together, these results suggest the possibility that tumor segmental aneuploidy makes significant contributions to variation in the lactic acidosis/hypoxia gene signatures in human cancers and demonstrate that latent factor analysis is a powerful means to uncover such a linkage.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: West Virginia has the worst oral health in the United States, but the reasons for this are unclear. This pilot study explored the etiology of this disparity using culture-independent analyses to identify bacterial species associated with oral disease. METHODS: Bacteria in subgingival plaque samples from twelve participants in two independent West Virginia dental-related studies were characterized using 16S rRNA gene sequencing and Human Oral Microbe Identification Microarray (HOMIM) analysis. Unifrac analysis was used to characterize phylogenetic differences between bacterial communities obtained from plaque of participants with low or high oral disease, which was further evaluated using clustering and Principal Coordinate Analysis. RESULTS: Statistically different bacterial signatures (P<0.001) were identified in subgingival plaque of individuals with low or high oral disease in West Virginia based on 16S rRNA gene sequencing. Low disease contained a high frequency of Veillonella and Streptococcus, with a moderate number of Capnocytophaga. High disease exhibited substantially increased bacterial diversity and included a large proportion of Clostridiales cluster bacteria (Selenomonas, Eubacterium, Dialister). Phylogenetic trees constructed using 16S rRNA gene sequencing revealed that Clostridiales were repeated colonizers in plaque associated with high oral disease, providing evidence that the oral environment is somehow influencing the bacterial signature linked to disease. CONCLUSIONS: Culture-independent analyses identified an atypical bacterial signature associated with high oral disease in West Virginians and provided evidence that the oral environment influenced this signature. Both findings provide insight into the etiology of the oral disparity in West Virginia.

Relevance:

30.00%

Publisher:

Abstract:

With increasing recognition of the roles RNA molecules and RNA/protein complexes play in an unexpected variety of biological processes, understanding of RNA structure-function relationships is of high current importance. To make clean biological interpretations from three-dimensional structures, it is imperative to have high-quality, accurate RNA crystal structures available, and the community has thoroughly embraced that goal. However, due to the many degrees of freedom inherent in RNA structure (especially for the backbone), it is a significant challenge to succeed in building accurate experimental models for RNA structures. This chapter describes the tools and techniques our research group and our collaborators have developed over the years to help RNA structural biologists both evaluate and achieve better accuracy. Expert analysis of large, high-resolution, quality-conscious RNA datasets provides the fundamental information that enables automated methods for robust and efficient error diagnosis in validating RNA structures at all resolutions. The even more crucial goal of correcting the diagnosed outliers has steadily developed toward highly effective, computationally based techniques. Automation enables solving complex issues in large RNA structures, but cannot circumvent the need for thoughtful examination of local details, and so we also provide some guidance for interpreting and acting on the results of current structure validation for RNA.

Relevance:

30.00%

Publisher:

Abstract:

Olfactory sensory neurons (OSNs), which detect a myriad of odorants, are known to express one allele of one olfactory receptor (OR) gene (Olfr) from the largest gene family in the mammalian genome. The OSNs expressing the same OR project their axons to the main olfactory bulb where they converge to form glomeruli. This “One neuron-one receptor rule” makes the olfactory epithelium (OE), which consists of a vast number of OSNs expressing unique ORs, one of the most heterogeneous cell populations. However, the mechanism by which the single OR allele is chosen remains unclear, as does the question of whether one OSN expresses only a single OR gene, a hypothesis that had not been rigorously verified when we performed the experiments. Moreover, failure of axonal targeting to a single glomerulus was observed in MeCP2-deficient OSNs, where delayed development was proposed as an explanation for the phenotype. How Mecp2 mutation causes this aberrant targeting is not entirely understood.

In this dissertation, we explored the transcriptomes of single, mature OSNs by single-cell RNA-Seq to reveal their heterogeneity and further studied OR gene expression in these isolated OSNs. The singularity of sequenced OSNs was ensured by the observation of monoallelic expression of X-linked genes in hybrid samples from crosses between mice of different strains, where strain-specific polymorphisms could be used to track the allelic origins of SNP-containing reads. The clustering of expression profiles from triplicates that originated from the same cell assured that the transcriptomic identities of OSNs were maintained through the experimental process. The average gene expression profiles of sequenced OSNs correlated well with conventional transcriptome data from FACS-sorted Omp-positive cells, and the top-ranked expression of the OR was confirmed in the single-OSN transcriptomes. While exploring cellular diversity, we revealed nearly 200 genes, in addition to OR genes, that were differentially expressed among the sequenced OSNs in this study. Among the 36 sequenced OSNs, eight cells (22.2%) showed multiple OR gene expression, and the presence of additional ORs was not restricted to neighboring loci that shared the transcriptional effect of the primary OR expression, suggesting that the “One neuron-one receptor rule” might not be strictly true at the transcription level. All of the inferable ORs, including additional co-expressed ORs, were shown to be monoallelic. Our sequencing of 21 Mecp2308 mutant OSNs, of which 62% expressed more than one OR gene and in which the expression levels of the additional ORs were significantly higher than those in the wild type, suggested that MeCP2 plays a role in the regulation of singular OR gene expression. Dual-label in situ hybridization, along with the sequence data, revealed that dorsal and ventral ORs were co-expressed in the same Mecp2 mutant OSN, further implying that MeCP2 might be involved in the regulation of OR territories in the OE. Our results suggested a new role for MeCP2 in OR gene choice and confirmed that the multiple-OR expression caused by Mecp2 mutation was not accompanied by the delayed OSN development observed in previous studies of Mecp2 mutants.
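
The allelic-origin tracking described above can be illustrated with a small Python sketch that classifies one SNP-containing gene in one cell from its strain-specific read counts. This is not the dissertation's pipeline: the function name `classify_allelic_expression`, the minimum read depth of 10, and the 90% major-allele threshold are hypothetical values chosen only to show the logic.

```python
def classify_allelic_expression(strain_a_reads, strain_b_reads,
                                min_reads=10, threshold=0.9):
    """Classify a gene in one cell as monoallelic, biallelic, or uninformative.

    strain_a_reads / strain_b_reads: counts of SNP-containing reads assigned
    to each parental strain. The cutoffs are illustrative assumptions.
    """
    total = strain_a_reads + strain_b_reads
    if total < min_reads:
        # Too few SNP-containing reads to infer the allelic origin.
        return "uninformative"
    major_fraction = max(strain_a_reads, strain_b_reads) / total
    return "monoallelic" if major_fraction >= threshold else "biallelic"


if __name__ == "__main__":
    print(classify_allelic_expression(50, 0))
    print(classify_allelic_expression(30, 25))
    print(classify_allelic_expression(3, 1))
```

In a truly single cell, X-linked genes are expected to fall in the monoallelic class, so a sample whose X-linked genes appear biallelic would more plausibly contain more than one cell.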

Relevance:

30.00%

Publisher:

Abstract:

The Arabidopsis root apical meristem (RAM) is a complex tissue capable of generating all the cell types that ultimately make up the root. The work presented in this thesis takes advantage of the versatility of high-throughput sequencing to address two independent questions about the root meristem. Although much is known about the cell fate decisions that occur at the RAM, cortex specification and differentiation remain poorly understood. In the first part of this thesis, I used an ethylmethanesulfonate (EMS) mutagenized marker line to perform a forward genetics screen. The goal of this screen was to identify novel genes involved in the specification and differentiation of the cortex tissue. Mapping analysis of the results obtained in this screen revealed a new allele of BRASSINOSTEROID4 with abnormal marker expression in the cortex tissue. Although this allele proved to be non-cortex specific, this project highlights new technology that allows mapping of EMS-generated mutations without the need to map-cross or back-cross. In the second part of this thesis, using fluorescence activated cell sorting (FACS) coupled with high-throughput sequencing, my collaborators and I generated single-base resolution whole genome DNA methylomes, mRNA transcriptomes, and small RNA transcriptomes for six different populations of cell types in the Arabidopsis root meristem. We discovered that the columella is hypermethylated in the CHH context within transposable elements. This hypermethylation is accompanied by upregulation of the RNA-directed DNA methylation (RdDM) pathway, including higher levels of 24-nt silencing RNAs (siRNAs). In summary, our studies demonstrate the versatility of high-throughput sequencing as a method for identifying single mutations or performing complex comparative genomic analyses.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Enterotoxigenic Escherichia coli (ETEC) is a globally prevalent cause of diarrhea. Though usually self-limited, it can be severe and debilitating. Little is known about the host transcriptional response to infection. We report the first gene expression analysis of the human host response to experimental challenge with ETEC. METHODS: We challenged 30 healthy adults with an unattenuated ETEC strain, and collected serial blood samples shortly after inoculation and daily for 8 days. We performed gene expression analysis on whole peripheral blood RNA samples from subjects in whom severe symptoms developed (n = 6) and a subset of those who remained asymptomatic (n = 6) despite shedding. RESULTS: Compared with baseline, symptomatic subjects demonstrated significantly different expression of 406 genes highlighting increased immune response and decreased protein synthesis. Compared with asymptomatic subjects, symptomatic subjects differentially expressed 254 genes primarily associated with immune response. This comparison also revealed 29 genes differentially expressed between groups at baseline, suggesting innate resilience to infection. Drug repositioning analysis identified several drug classes with potential utility in augmenting immune response or mitigating symptoms. CONCLUSIONS: There are statistically significant and biologically plausible differences in host gene expression induced by ETEC infection. Differential baseline expression of some genes may indicate resilience to infection.

Relevance:

30.00%

Publisher:

Abstract:

Mitotic genome instability can occur during the repair of double-strand breaks (DSBs) in DNA, which arise from endogenous and exogenous sources. Studying the mechanisms of DNA repair in the budding yeast Saccharomyces cerevisiae has shown that homologous recombination (HR) is a vital repair mechanism for DSBs. HR can result in a crossover event, in which the broken molecule reciprocally exchanges information with a homologous repair template. The current model of double-strand break repair (DSBR) also allows for a tract of information to transfer non-reciprocally from the template molecule to the broken molecule. These “gene conversion” events can vary in size and can occur in conjunction with a crossover event or in isolation. The frequency and size of gene conversions in isolation and of gene conversions associated with crossing over have been a source of debate due to the variation in systems used to detect gene conversions and the context in which the gene conversions are measured.

In Chapter 2, I use an unbiased system that measures the frequency and size of gene conversion events, as well as the association of gene conversion events with crossing over between homologs in diploid yeast. We show mitotic gene conversions occur at a rate of 1.3 × 10⁻⁶ per cell division, are either large (median 54.0 kb) or small (median 6.4 kb), and are associated with crossing over 43% of the time.

DSBs can arise from endogenous cellular processes such as replication and transcription. Two important RNA/DNA hybrids are involved in replication and transcription: R-loops, which form when an RNA transcript base pairs with the DNA template and displaces the non-template DNA strand, and ribonucleotides embedded into DNA (rNMPs), which arise when replicative polymerases erroneously insert ribonucleotide instead of deoxyribonucleotide triphosphates. RNaseH1 (encoded by RNH1) and RNaseH2 (whose catalytic subunit is encoded by RNH201) both recognize and degrade the RNA within R-loops, while RNaseH2 alone recognizes, nicks, and initiates removal of rNMPs embedded into DNA. Because the two enzymes are redundant in their ability to act on RNA:DNA hybrids, aberrant removal of rNMPs from DNA has been thought to lead to genome instability in an rnh201Δ background.

In Chapter 3, I characterize (1) non-selective genome-wide homologous recombination events and (2) crossing over on chromosome IV in mutants defective in RNaseH1, RNaseH2, or both RNaseH1 and RNaseH2. Using a mutant DNA polymerase that incorporates 4-fold fewer rNMPs than wild type, I demonstrate that the primary recombinogenic lesion in the RNaseH2-defective genome is not rNMPs, but rather R-loops. This work suggests different in vivo roles for RNaseH1 and RNaseH2 in resolving R-loops in yeast and is consistent with R-loops, not rNMPs, being the likely source of pathology in Aicardi-Goutières Syndrome patients defective in RNaseH2.