33 results for Differential Expression Profiling
Abstract:
Eukaryotic genomes are mostly composed of noncoding DNA whose role is still poorly understood. Studies in several organisms have shown correlations between the length of the intergenic and genic sequences of a gene and the expression of its corresponding mRNA transcript. Some studies have found a positive relationship between intergenic sequence length and expression diversity between tissues, and concluded that genes under greater regulatory control require more regulatory information in their intergenic sequences. Other reports found a negative relationship between expression level and gene length and the interpretation was that there is selection pressure for highly expressed genes to remain small. However, a correlation between gene sequence length and expression diversity, opposite to that observed for intergenic sequences, has also been reported, and to date there is no testable explanation for this observation. To shed light on these varied and sometimes conflicting results, we performed a thorough study of the relationships between sequence length and gene expression using cell-type (tissue) specific microarray data in Arabidopsis thaliana. We measured median gene expression across tissues (expression level), expression variability between tissues (expression pattern uniformity), and expression variability between replicates (expression noise). We found that intergenic (upstream and downstream) and genic (coding and noncoding) sequences have generally opposite relationships with respect to expression, whether it is tissue variability, median, or expression noise. To explain these results we propose a model, in which the lengths of the intergenic and genic sequences have opposite effects on the ability of the transcribed region of the gene to be epigenetically regulated for differential expression. These findings could shed light on the role and influence of noncoding sequences on gene expression.
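The three expression measures defined above (expression level, tissue variability, and expression noise) can be sketched as follows. This is a minimal illustration, assuming a hypothetical array `expr` of shape (genes, tissues, replicates). The choice of the coefficient of variation as the variability statistic is an assumption made here and may differ from the statistic used in the study.

```python
import numpy as np

def expression_measures(expr):
    """Summarize expression per gene.

    expr: array of shape (genes, tissues, replicates); this layout is hypothetical.
    Returns (level, tissue_cv, noise), each of shape (genes,).
    """
    expr = np.asarray(expr, dtype=float)
    tissue_means = expr.mean(axis=2)                 # mean over replicates
    level = np.median(tissue_means, axis=1)          # median expression across tissues
    # Variability between tissues: coefficient of variation of the tissue means.
    tissue_cv = tissue_means.std(axis=1, ddof=1) / tissue_means.mean(axis=1)
    # Noise: coefficient of variation between replicates, averaged over tissues.
    replicate_cv = expr.std(axis=2, ddof=1) / expr.mean(axis=2)
    noise = replicate_cv.mean(axis=1)
    return level, tissue_cv, noise
```

Each measure could then be correlated with intergenic or genic sequence length, for example with a rank correlation.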
Abstract:
BACKGROUND: Over the past two decades more than fifty thousand unique clinical and biological samples have been assayed using the Affymetrix HG-U133 and HG-U95 GeneChip microarray platforms. This substantial repository has been used extensively to characterize changes in gene expression between biological samples, but has not been previously mined en masse for changes in mRNA processing. We explored the possibility of using HG-U133 microarray data to identify changes in alternative mRNA processing in several available archival datasets. RESULTS: Data from these and other gene expression microarrays can now be mined for changes in transcript isoform abundance using a program described here, SplicerAV. Using in vivo and in vitro breast cancer microarray datasets, SplicerAV was able to perform both gene and isoform specific expression profiling within the same microarray dataset. Our reanalysis of Affymetrix U133 plus 2.0 data generated by in vitro over-expression of HRAS, E2F3, beta-catenin (CTNNB1), SRC, and MYC identified several hundred oncogene-induced mRNA isoform changes, one of which recognized a previously unknown mechanism of EGFR family activation. Using clinical data, SplicerAV predicted 241 isoform changes between low and high grade breast tumors; with changes enriched among genes coding for guanyl-nucleotide exchange factors, metalloprotease inhibitors, and mRNA processing factors. Isoform changes in 15 genes were associated with aggressive cancer across the three breast cancer datasets. CONCLUSIONS: Using SplicerAV, we identified several hundred previously uncharacterized isoform changes induced by in vitro oncogene over-expression and revealed a previously unknown mechanism of EGFR activation in human mammary epithelial cells. We analyzed Affymetrix GeneChip data from over 400 human breast tumors in three independent studies, making this the largest clinical dataset analyzed for en masse changes in alternative mRNA processing. 
The capacity to detect RNA isoform changes in archival microarray data using SplicerAV allowed us to carry out the first analysis of isoform specific mRNA changes directly associated with cancer survival.
Abstract:
BACKGROUND: Biological processes occur on a vast range of time scales, and many of them occur concurrently. As a result, system-wide measurements of gene expression have the potential to capture many of these processes simultaneously. The challenge however, is to separate these processes and time scales in the data. In many cases the number of processes and their time scales is unknown. This issue is particularly relevant to developmental biologists, who are interested in processes such as growth, segmentation and differentiation, which can all take place simultaneously, but on different time scales. RESULTS: We introduce a flexible and statistically rigorous method for detecting different time scales in time-series gene expression data, by identifying expression patterns that are temporally shifted between replicate datasets. We apply our approach to a Saccharomyces cerevisiae cell-cycle dataset and an Arabidopsis thaliana root developmental dataset. In both datasets our method successfully detects processes operating on several different time scales. Furthermore we show that many of these time scales can be associated with particular biological functions. CONCLUSIONS: The spatiotemporal modules identified by our method suggest the presence of multiple biological processes, acting at distinct time scales in both the Arabidopsis root and yeast. Using similar large-scale expression datasets, the identification of biological processes acting at multiple time scales in many organisms is now possible.
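The idea of an expression pattern that is temporally shifted between replicate datasets can be illustrated with a simple lagged correlation. The sketch below is not the statistical method of the paper. It is a stand-in that assumes two equally sampled replicate time series for one gene, and the function name `best_lag` is hypothetical.

```python
import numpy as np

def best_lag(x, y, max_lag):
    """Return (lag, correlation) for the shift at which y best matches x.

    A positive lag means the pattern in y occurs later than in x.
    Only the overlapping portion of the two series is compared at each lag.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    best, best_r = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:n + lag]
        r = np.corrcoef(a, b)[0, 1]   # Pearson correlation on the overlap
        if r > best_r:
            best, best_r = lag, r
    return best, best_r
```

Genes whose best lag differs between replicates could then be grouped by lag, which is one rough way to separate processes running on different time scales.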
Abstract:
BACKGROUND: Mutations in the TP53 gene are extremely common and occur very early in the progression of serous ovarian cancers. Gene expression patterns that relate to mutational status may provide insight into the etiology and biology of the disease. METHODS: The TP53 coding region was sequenced in 89 frozen serous ovarian cancers, 40 early stage (I/II) and 49 advanced stage (III/IV). Affymetrix U133A expression data was used to define gene expression patterns by mutation, type of mutation, and cancer stage. RESULTS: Missense or chain terminating (null) mutations in TP53 were found in 59/89 (66%) ovarian cancers. Early stage cancers had a significantly higher rate of null mutations than late stage disease (38% vs. 8%, p < 0.03). In advanced stage cases, mutations were more prevalent in short term survivors than long term survivors (81% vs. 30%, p = 0.0004). Gene expression patterns had a robust ability to predict TP53 status within training data. By using early versus late stage disease for out of sample predictions, the signature derived from early stage cancers could accurately (86%) predict mutation status of late stage cancers. CONCLUSIONS: This represents the first attempt to define a genomic signature of TP53 mutation in ovarian cancer. Patterns of gene expression characteristic of TP53 mutation could be discerned and included several genes that are known p53 targets or have been described in the context of expression signatures of TP53 mutation in breast cancer.
Abstract:
BACKGROUND: Previous work has demonstrated the potential for peripheral blood (PB) gene expression profiling for the detection of disease or environmental exposures. METHODS AND FINDINGS: We have sought to determine the impact of several variables on the PB gene expression profile of an environmental exposure, ionizing radiation, and to determine the specificity of the PB signature of radiation versus other genotoxic stresses. Neither genotype differences nor the time of PB sampling caused any lessening of the accuracy of PB signatures to predict radiation exposure, but sex difference did influence the accuracy of the prediction of radiation exposure at the lowest level (50 cGy). A PB signature of sepsis was also generated and both the PB signature of radiation and the PB signature of sepsis were found to be 100% specific at distinguishing irradiated from septic animals. We also identified human PB signatures of radiation exposure and chemotherapy treatment which distinguished irradiated patients and chemotherapy-treated individuals within a heterogeneous population with accuracies of 90% and 81%, respectively. CONCLUSIONS: We conclude that PB gene expression profiles can be identified in mice and humans that are accurate in predicting medical conditions, are specific to each condition and remain highly accurate over time.
Abstract:
BACKGROUND: Since mature erythrocytes are terminally differentiated cells without nuclei and organelles, it is commonly thought that they do not contain nucleic acids. In this study, we have re-examined this issue by analyzing the transcriptome of a purified population of human mature erythrocytes from individuals with normal hemoglobin (HbAA) and homozygous sickle cell disease (HbSS). METHODS AND FINDINGS: Using a combination of microarray analysis, real-time RT-PCR and Northern blots, we found that mature erythrocytes, while lacking ribosomal and large-sized RNAs, contain abundant and diverse microRNAs. MicroRNA expression of erythrocytes was different from that of reticulocytes and leukocytes, and contributed the majority of the microRNA expression in whole blood. When we used microRNA microarrays to analyze erythrocytes from HbAA and HbSS individuals, we noted a dramatic difference in their microRNA expression pattern. We found that miR-320 played an important role in the down-regulation of its target gene, CD71, during reticulocyte terminal differentiation. Further investigation revealed that poor expression of miR-320 in HbSS cells was associated with their defective down-regulation of CD71 during terminal differentiation. CONCLUSIONS: In summary, we have discovered significant microRNA expression in human mature erythrocytes, which is dramatically altered in HbSS erythrocytes and is associated with their defect in terminal differentiation. Thus, the global analysis of microRNA expression in circulating erythrocytes can provide mechanistic insights into the disease phenotypes of erythrocyte diseases.
Abstract:
Hybrid dysfunctions, such as sterility, may result in part from disruptions in the regulation of gene expression. Studies of hybrids within the Drosophila simulans clade have reported genes expressed above or below the expression observed in their parent species, and such misexpression is associated with male sterility in multigenerational backcross hybrids. However, these studies often examined whole bodies rather than testes or had limited replication using less-sensitive but global techniques. Here, we use a new RNA isolation technique to re-examine hybrid gene expression disruptions in both testes and whole bodies from single Drosophila males by real-time quantitative RT-PCR. We find two early-spermatogenesis transcripts are underexpressed in hybrid whole-bodies but not in assays of testes alone, while two late-spermatogenesis transcripts seem to be underexpressed in both whole-bodies and testes alone. Although the number of transcripts surveyed is limited, these results provide some support for a previous hypothesis that the spermatogenesis pathway in these sterile hybrids may be disrupted sometime after the expression of the early meiotic arrest genes.
Abstract:
In the event of a terrorist-mediated attack in the United States using radiological or improvised nuclear weapons, it is expected that hundreds of thousands of people could be exposed to life-threatening levels of ionizing radiation. We have recently shown that genome-wide expression analysis of the peripheral blood (PB) can generate gene expression profiles that can predict radiation exposure and distinguish the dose level of exposure following total body irradiation (TBI). However, in the event of a radiation mass-casualty scenario, many victims will have heterogeneous exposure due to partial shielding and it is unknown whether PB gene expression profiles would be useful in predicting the status of partially irradiated individuals. Here, we identified gene expression profiles in the PB that were characteristic of anterior hemibody-, posterior hemibody- and single limb-irradiation at 0.5 Gy, 2 Gy and 10 Gy in C57Bl6 mice. These PB signatures predicted the radiation status of partially irradiated mice with a high level of accuracy (range 79-100%) compared to non-irradiated mice. Interestingly, PB signatures of partial body irradiation were poorly predictive of radiation status by site of injury (range 16-43%), suggesting that the PB molecular response to partial body irradiation was anatomic site specific. Importantly, PB gene signatures generated from TBI-treated mice failed completely to predict the radiation status of partially irradiated animals or non-irradiated controls. These data demonstrate that partial body irradiation, even to a single limb, generates a characteristic PB signature of radiation injury and thus may necessitate the use of multiple signatures, both partial body and total body, to accurately assess the status of an individual exposed to radiation.
Abstract:
There is great interindividual variability in HIV-1 viral setpoint after seroconversion, some of which is known to be due to genetic differences among infected individuals. Here, our focus is on determining, genome-wide, the contribution of variable gene expression to viral control, and to relate it to genomic DNA polymorphism. RNA was extracted from purified CD4+ T-cells from 137 HIV-1 seroconverters, 16 elite controllers, and 3 healthy blood donors. Expression levels of more than 48,000 mRNA transcripts were assessed by the Human-6 v3 Expression BeadChips (Illumina). Genome-wide SNP data was generated from genomic DNA using the HumanHap550 Genotyping BeadChip (Illumina). We observed two distinct profiles with 260 genes differentially expressed depending on HIV-1 viral load. There was significant upregulation of expression of interferon stimulated genes with increasing viral load, including genes of the intrinsic antiretroviral defense. Upon successful antiretroviral treatment, the transcriptome profile of previously viremic individuals reverted to a pattern comparable to that of elite controllers and of uninfected individuals. Genome-wide evaluation of cis-acting SNPs identified genetic variants modulating expression of 190 genes. Those were compared to the genes whose expression was found associated with viral load: expression of one interferon stimulated gene, OAS1, was found to be regulated by a SNP (rs3177979, p = 4.9E-12); however, we could not detect an independent association of the SNP with viral setpoint. Thus, this study represents an attempt to integrate genome-wide SNP signals with genome-wide expression profiles in the search for biological correlates of HIV-1 control. It underscores the paradox of the association between increasing levels of viral load and greater expression of antiviral defense pathways. It also shows that elite controllers do not have a fully distinctive mRNA expression pattern in CD4+ T cells. 
Overall, changes in global RNA expression reflect responses to viral replication rather than a mechanism that might explain viral control.
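The cis-acting SNP evaluation described above tests whether genotype at a nearby variant is associated with a gene's expression. A minimal sketch of one such test for a single SNP-gene pair follows, assuming an additive model with genotypes coded as 0, 1, or 2 copies of the minor allele. The function name and the coding are assumptions for illustration and are not taken from the study.

```python
import numpy as np

def cis_eqtl_effect(genotype, expression):
    """Fit expression = intercept + slope * genotype by least squares.

    genotype: minor-allele counts (0, 1, 2) per individual (assumed coding).
    expression: expression values for one gene in the same individuals.
    Returns (slope, intercept, r_squared).
    """
    g = np.asarray(genotype, dtype=float)
    e = np.asarray(expression, dtype=float)
    slope, intercept = np.polyfit(g, e, 1)      # degree-1 polynomial fit
    r = np.corrcoef(g, e)[0, 1]                 # Pearson correlation
    return slope, intercept, r ** 2
```

A genome-wide scan would repeat this for every SNP near every gene and correct the resulting p-values for multiple testing, a step this sketch omits.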
Abstract:
Although it has recently been shown that A/J mice are highly susceptible to Staphylococcus aureus sepsis as compared to C57BL/6J, the specific genes responsible for this differential phenotype are unknown. Using chromosome substitution strains (CSS), we found that loci on chromosomes 8, 11, and 18 influence susceptibility to S. aureus sepsis in A/J mice. We then used two candidate gene selection strategies to identify genes on these three chromosomes associated with S. aureus susceptibility, and targeted genes identified by both gene selection strategies. First, we used whole genome transcription profiling to identify 191 (56 on chr. 8, 100 on chr. 11, and 35 on chr. 18) genes on our three chromosomes of interest that are differentially expressed between S. aureus-infected A/J and C57BL/6J. Second, we identified two significant quantitative trait loci (QTL) for survival post-infection on chr. 18 using N(2) backcross mice (F(1) [C18A]xC57BL/6J). Ten genes on chr. 18 (March3, Cep120, Chmp1b, Dcp2, Dtwd2, Isoc1, Lman1, Spire1, Tnfaip8, and Seh1l) mapped to the two significant QTL regions and were also identified by the expression array selection strategy. Using real-time PCR, 6 of these 10 genes (Chmp1b, Dtwd2, Isoc1, Lman1, Tnfaip8, and Seh1l) showed significantly different expression levels between S. aureus-infected A/J and C57BL/6J. For two (Tnfaip8 and Seh1l) of these 6 genes, siRNA-mediated knockdown of gene expression in S. aureus-challenged RAW264.7 macrophages induced significant changes in the cytokine response (IL-1 beta and GM-CSF) compared to negative controls. These cytokine response changes were consistent with those seen in S. aureus-challenged peritoneal macrophages from CSS 18 mice (which contain A/J chromosome 18 but are otherwise C57BL/6J), but not C57BL/6J mice. These findings suggest that two genes, Tnfaip8 and Seh1l, may contribute to susceptibility to S. aureus in A/J mice, and represent promising candidates for human genetic susceptibility studies.
Abstract:
The brain is a highly adaptable organ that is capable of converting sensory information into changes in neuronal function. This plasticity allows behavior to be accommodated to the environment, providing an important evolutionary advantage. Neurons convert environmental stimuli into long-lasting changes in their physiology in part through the synaptic activity-regulated transcription of new gene products. Since the neurotransmitter-dependent regulation of Fos transcription was first discovered nearly 25 years ago, a wealth of studies have enriched our understanding of the molecular pathways that mediate activity-regulated changes in gene transcription. These findings show that a broad range of signaling pathways and transcriptional regulators can be engaged by neuronal activity to sculpt complex programs of stimulus-regulated gene transcription. However, the sheer scope of the transcriptional pathways engaged by neuronal activity raises the question of how specificity in the nature of the transcriptional response is achieved in order to encode physiologically relevant responses to divergent stimuli. Here we summarize the general paradigms by which neuronal activity regulates transcription while focusing on the molecular mechanisms that confer differential stimulus-, cell-type-, and developmental-specificity upon activity-regulated programs of neuronal gene transcription. In addition, we preview some of the new technologies that will advance our future understanding of the mechanisms and consequences of activity-regulated gene transcription in the brain.
Abstract:
BACKGROUND: Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. RESULTS: Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), rhinovirus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. CONCLUSIONS: Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
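To show how the Indian Buffet Process leaves the number of factors open rather than fixing it in advance, the sketch below draws a binary factor-assignment matrix from the IBP prior using the standard "customers and dishes" construction. It covers only a draw from the prior, not the Gibbs or variational inference described in the abstract. The function name and the concentration parameter `alpha` are illustrative assumptions.

```python
import numpy as np

def sample_ibp(n_customers, alpha, rng):
    """Draw a binary matrix Z (customers x dishes) from the IBP prior.

    Customer i takes each existing dish k with probability m_k / i, where m_k
    is the number of earlier customers who took it, then tries
    Poisson(alpha / i) new dishes. The number of columns is random.
    """
    dish_counts = []   # m_k for every dish introduced so far
    rows = []
    for i in range(1, n_customers + 1):
        row = [rng.random() < m / i for m in dish_counts]
        n_new = int(rng.poisson(alpha / i))      # plain int, so list repetition works
        row.extend([True] * n_new)
        dish_counts = [m + int(taken) for m, taken in zip(dish_counts, row)]
        dish_counts.extend([1] * n_new)
        rows.append(row)
    z = np.zeros((n_customers, len(dish_counts)), dtype=int)
    for i, row in enumerate(rows):
        z[i, :len(row)] = row
    return z
```

In a sparse factor model, rows would correspond to genes (or samples) and columns to factors. For example, `sample_ibp(20, 2.0, np.random.default_rng(0))` returns a matrix whose column count varies with the seed and with `alpha`.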
Abstract:
Knowing the timing, level, cellular localization, and cell type that a gene is expressed in contributes to our understanding of the function of the gene. Each of these features can be determined by in situ hybridization to mRNAs within cells. Here we present a radioactive in situ hybridization method modified from Clayton et al. (1988)(1) that has been working successfully in our lab for many years, especially for adult vertebrate brains(2-5). The long complementary RNA (cRNA) probes to the target sequence allow for detection of low abundance transcripts(6,7). Incorporation of radioactive nucleotides into the cRNA probes allows for further detection sensitivity of low abundance transcripts and quantitative analyses, either by light sensitive x-ray film or emulsion coated over the tissue. These detection methods provide a long-term record of target gene expression. Compared with non-radioactive probe methods, such as DIG-labeling, the radioactive probe hybridization method does not require multiple amplification steps using HRP-antibodies and/or TSA kit to detect low abundance transcripts. Therefore, this method provides a linear relation between signal intensity and targeted mRNA amounts for quantitative analysis. It allows processing 100-200 slides simultaneously. It works well for different developmental stages of embryos. Most developmental studies of gene expression use whole embryos and non-radioactive approaches(8,9), in part because embryonic tissue is more fragile than adult tissue, with less cohesion between cells, making it difficult to see boundaries between cell populations with tissue sections. In contrast, our radioactive approach, due to the larger range of sensitivity, is able to obtain higher contrast in resolution of gene expression between tissue regions, making it easier to see boundaries between populations. Using this method, researchers could reveal the possible significance of a newly identified gene, and further predict the function of the gene of interest.
Abstract:
Spoken language and learned song are complex communication behaviors found in only a few species, including humans and three groups of distantly related birds--songbirds, parrots, and hummingbirds. Despite their large phylogenetic distances, these vocal learners show convergent behaviors and associated brain pathways for vocal communication. However, it is not clear whether this behavioral and anatomical convergence is associated with molecular convergence. Here we used oligo microarrays to screen for genes differentially regulated in brain nuclei necessary for producing learned vocalizations relative to adjacent brain areas that control other behaviors in avian vocal learners versus vocal non-learners. A top candidate gene in our screen was a calcium-binding protein, parvalbumin (PV). In situ hybridization verification revealed that PV was expressed significantly higher throughout the song motor pathway, including brainstem vocal motor neurons relative to the surrounding brain regions of all distantly related avian vocal learners. This differential expression was specific to PV and vocal learners, as it was not found in avian vocal non-learners nor for control genes in learners and non-learners. Similar to the vocal learning birds, higher PV up-regulation was found in the brainstem tongue motor neurons used for speech production in humans relative to a non-human primate, macaques. These results suggest repeated convergent evolution of differential PV up-regulation in the brains of vocal learners separated by more than 65-300 million years from a common ancestor and that the specialized behaviors of learned song and speech may require extra calcium buffering and signaling.
Abstract:
Nutrient availability profoundly influences gene expression. Many animal genes encode multiple transcript isoforms, yet the effect of nutrient availability on transcript isoform expression has not been studied in genome-wide fashion. When Caenorhabditis elegans larvae hatch without food, they arrest development in the first larval stage (L1 arrest). Starved larvae can survive L1 arrest for weeks, but growth and post-embryonic development are rapidly initiated in response to feeding. We used RNA-seq to characterize the transcriptome during L1 arrest and over time after feeding. Twenty-seven percent of detectable protein-coding genes were differentially expressed during recovery from L1 arrest, with the majority of changes initiating within the first hour, demonstrating widespread, acute effects of nutrient availability on gene expression. We used two independent approaches to track expression of individual exons and mRNA isoforms, and we connected changes in expression to functional consequences by mining a variety of databases. These two approaches identified an overlapping set of genes with alternative isoform expression, and they converged on common functional patterns. Genes affecting mRNA splicing and translation are regulated by alternative isoform expression, revealing post-transcriptional consequences of nutrient availability on gene regulation. We also found that phosphorylation sites are often alternatively expressed, revealing a common mode by which alternative isoform expression modifies protein function and signal transduction. Our results detail rich changes in C. elegans gene expression as larvae initiate growth and post-embryonic development, and they provide an excellent resource for ongoing investigation of transcriptional regulation and developmental physiology.