13 results for microarray profiling
in DigitalCommons@The Texas Medical Center
Abstract:
The X-linked mouse Rhox gene cluster contains over 30 homeobox genes that are candidates to regulate multiple steps in male and female gametogenesis. The founding member of the Rhox gene cluster, Rhox5, is an androgen-dependent gene expressed in Sertoli cells that promotes the survival and differentiation of the adjacent male germ cells. To decipher downstream signaling pathways of Rhox5, I used in vivo and in vitro microarray profiling to identify and characterize downstream targets of Rhox5 in the testis. This led to the identification of many Rhox5-regulated genes, two of which I focused on in more detail. One of them, Unc5c, encodes a pro-apoptotic receptor with tumor suppressor activity that I found is negatively regulated by Rhox5 through a Rhox5-response element in the Unc5c 5' untranslated region (5' UTR). Examination of other mouse Rhox family members revealed that Rhox2 and Rhox3 also have the ability to downregulate Unc5c expression. The human RHOX protein RHOXF2 also had this ability, indicating that Unc5c repression is a conserved Rhox-dependent response. The repression of Unc5c expression by Rhox5 may, in part, mediate Rhox5's pro-survival function in the testis, as I found that Unc5c mutant mice have decreased germ cell apoptosis in the testis. This, along with my other data, leads me to propose a model in which Rhox5 is a negative regulator upstream of Unc5c in a Sertoli-cell pathway that promotes germ-cell survival. The other Rhox5-regulated gene that I studied in detail is insulin II (Ins2). Several lines of evidence, including electrophoretic mobility shift analysis, promoter mutagenesis, and chromatin immunoprecipitation analysis, indicated that Ins2 is a direct target of Rhox5. Structure-function analysis identified homeodomain residues and the RHOX5 amino-terminal domain crucial for conferring Ins2 inducibility.
Rhox5 regulates not only the Ins2 gene but also genes encoding other secreted proteins regulating metabolism (adiponectin and resistin), the rate-limiting enzyme for monounsaturated fatty acid biosynthesis (SCD-1), and transcription factors crucial for regulating metabolism (the nuclear hormone receptor PPARγ). I propose that the regulation of some or all of these molecules in Sertoli cells is responsible for the Rhox5-dependent survival of the adjacent germ cells.
High-resolution microarray analysis of chromosome 20q in human colon cancer metastasis model systems
Abstract:
Amplification of human chromosome 20q DNA is the most frequently occurring chromosomal abnormality detected in sporadic colorectal carcinomas and shows significant correlation with liver metastases. Through comprehensive high-resolution microarray comparative genomic hybridization and microarray gene expression profiling, we have characterized chromosome 20q amplicon genes associated with human colorectal cancer metastasis in two in vitro metastasis model systems. The results revealed increasing complexity of the 20q genomic profile from the primary tumor-derived cell lines to the lymph node and liver metastasis-derived cell lines. Expression analysis of chromosome 20q revealed a subset of overexpressed genes residing within the regions of genomic copy number gain in all the tumor cell lines, suggesting these are chromosome 20q copy number-responsive genes. Based on their preferential expression levels in the model system cell lines and known biological function, four of the overexpressed genes mapping to the common intervals of genomic copy gain were considered the most promising candidate colorectal metastasis-associated genes. Validation of genomic copy number and expression array data was carried out on these genes, with one gene, DNMT3B, standing out as expressed at relatively higher levels in the metastasis-derived cell lines compared with their primary-derived counterparts in both model systems analyzed. The data provide evidence for the role of chromosome 20q genes with low copy gain and elevated expression in the clonal evolution of metastatic cells and suggest that such genes may serve as early biomarkers of metastatic potential. The data also support the utility of the combined microarray comparative genomic hybridization and expression array analysis for identifying copy number-responsive genes in areas of low DNA copy gain in cancer cells.
Abstract:
Microarray technology is a high-throughput method for genotyping and gene expression profiling. Limited sensitivity and specificity are among the essential problems of this technology. Most existing methods of microarray data analysis have an apparent limitation: they deal only with the numerical part of microarray data and make little use of gene sequence information. Because it is the gene sequences that precisely define the physical objects being measured by a microarray, it is natural to make the gene sequences an essential part of the data analysis. This dissertation focused on the development of free energy models to integrate sequence information in microarray data analysis. The models were used to characterize the mechanism of hybridization on microarrays and enhance sensitivity and specificity of microarray measurements. Cross-hybridization is a major obstacle to the sensitivity and specificity of microarray measurements. In this dissertation, we evaluated the scope of the cross-hybridization problem on short-oligo microarrays. The results showed that cross-hybridization on arrays is mostly caused by oligo fragments with a run of 10 to 16 nucleotides complementary to the probes. Furthermore, a free-energy-based model was proposed to quantify the amount of cross-hybridization signal on each probe. This model treats cross-hybridization as an integral effect of the interactions between a probe and various off-target oligo fragments. Using public spike-in datasets, the model showed high accuracy in predicting the cross-hybridization signals on those probes whose intended targets are absent in the sample. Several prospective models were proposed to improve the Positional Dependent Nearest-Neighbor (PDNN) model for better quantification of gene expression and cross-hybridization. The problem addressed in this dissertation is fundamental to the microarray technology.
We expect that this study will help us to understand the detailed mechanism that determines sensitivity and specificity on the microarrays. Consequently, this research will have a wide impact on how microarrays are designed and how the data are interpreted.
Abstract:
Embryonic stem cells (ESCs) possess two unique characteristics: infinite self-renewal and the potential to differentiate into almost every cell type (pluripotency). Recently, global expression analyses of metastatic breast and lung cancers revealed an ESC-like expression program or signature, specifically for cancers that are mutant for p53 function. Surprisingly, although p53 is widely recognized as the guardian of the genome, due to its roles in cell cycle checkpoints, programmed cell death or senescence, relatively little is known about p53 functions in normal cells, especially in ESCs. My hypothesis is that p53 has specific transcription regulatory functions in human ESCs (hESCs) that a) oppose pluripotency and b) protect the stem cell genome in response to DNA damage and stress signaling. In mouse ESCs, these roles are believed to coincide, as p53 promotes differentiation in response to DNA damage, but this is unexplored in hESCs. To determine the biological roles of p53, specifically in hESCs, we mapped genome-wide chromatin interactions of p53 by chromatin immunoprecipitation and massively parallel tag sequencing (ChIP-Seq), and did so under three different conditions of hESC status: pluripotency, differentiation-initiated and DNA-damage-induced. ChIP-Seq showed that p53 is enriched at distinct, induction-specific gene loci during each of these different conditions. Microarray gene expression analysis and functional annotation of the distinct p53-target genes revealed that p53 regulates specific genes encoding developmental regulators, which are expressed in differentiation-initiated but not DNA-damaged hESCs. We further discovered that, in response to differentiation signaling, p53 binds regions of chromatin that are repressed but also poised for rapid activation by core pluripotency factors OCT4 and NANOG in pluripotent hESCs.
In response to DNA damage, genes associated with migration and motility are targeted by p53, whereas the prime targets of p53 in control of cell death are conserved for p53 regulation in both differentiation and DNA damage. Our genome-wide profiling and bioinformatics analyses show that p53 occupies a special set of developmental regulatory genes during early differentiation of hESCs and functions in an induction-specific manner. In conclusion, our research unveiled previously unknown functions of p53 in ESC biology, which augments our understanding of one of the most deregulated proteins in human cancers.
Abstract:
Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death with gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the Cox proportional hazards (CPH), Weibull, and accelerated failure time (AFT) survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
Abstract:
BACKGROUND: We previously identified ebpR, encoding a potential member of the AtxA/Mga transcriptional regulator family, and showed that it is important for transcriptional activation of the Enterococcus faecalis endocarditis and biofilm-associated pilus operon, ebpABC. Although ebpR is not absolutely essential for ebpABC expression (100-fold reduction), its deletion led to phenotypes similar to those of an ebpABC mutant such as absence of pili at the cell surface and, consequently, reduced biofilm formation. A non-piliated ebpABC mutant has been shown to be attenuated in a rat model of endocarditis and in a murine urinary tract infection model, indicating an important participation of the ebpR-ebpABC locus in virulence. However, there is no report relating to the environmental conditions that affect expression of the ebpR-ebpABC locus. RESULTS: In this study, we examined the effect of CO2/HCO3(-), pH, and the Fsr system on the ebpR-ebpABC locus expression. The presence of 5% CO2/0.1 M HCO3(-) increased ebpR-ebpABC expression, while the Fsr system was confirmed to be a weak repressor of this locus. The mechanism by which the Fsr system repressed the ebpR-ebpABC locus expression appears independent of the effects of CO2/bicarbonate. Furthermore, by using an ebpA::lacZ fusion as a reporter, we showed that addition of 0.1 M sodium bicarbonate to TSBG (buffered at pH 7.5), but not the presence of 5% CO2, induced ebpA expression in TSBG broth. In addition, using microarray analysis, we found 73 genes affected by the presence of sodium bicarbonate (abs(fold) > 2, P < 0.05), the majority of which belong to the PTS system and ABC transporter families. Finally, pilus production correlated with ebpA mRNA levels under the conditions tested. CONCLUSIONS: This study reports that the ebp locus expression is enhanced by the presence of bicarbonate with a consequential increase in the number of cells producing pili.
Although the molecular basis of the bicarbonate effect remains unclear, the pathway is independent of the Fsr system. In conclusion, E. faecalis joins the growing family of pathogens that regulates virulence gene expression in response to bicarbonate and/or CO2.
Abstract:
PURPOSE: The present study defines genomic loci underlying coordinate changes in gene expression following retinal injury. METHODS: A group of acute phase genes expressed in diverse nervous system tissues was defined by combining microarray results from injury studies from rat retina, brain, and spinal cord. Genomic loci regulating the brain expression of acute phase genes were identified using a panel of BXD recombinant inbred (RI) mouse strains. Candidate upstream regulators within a locus were defined using single nucleotide polymorphism databases and promoter motif databases. RESULTS: The acute phase response of rat retina, brain, and spinal cord was dominated by transcription factors. Three genomic loci control transcript expression of acute phase genes in brains of BXD RI mouse strains. One locus was identified on chromosome 12 and was highly correlated with the expression of classic acute phase genes. Within the locus we identified the inhibitor of DNA binding 2 (Id2) as a candidate upstream regulator. Id2 was upregulated as an acute phase transcript in injury models of rat retina, brain, and spinal cord. CONCLUSIONS: We defined a group of transcriptional changes associated with the retinal acute injury response. Using genetic linkage analysis of natural transcript variation, we identified regulatory loci and candidate regulators that control transcript levels of acute phase genes.
Abstract:
In chronic lymphocytic leukemia (CLL), one of the best predictors of outcome is the somatic mutation status of the immunoglobulin heavy chain variable region (IGHV) genes. Patients whose CLL cells have unmutated IGHV genes have a median survival of 8 years; those with mutated IGHV genes have a median survival of 25 years. To identify new prognostic biomarkers and molecular targets for therapy in untreated CLL patients, we reanalyzed the raw data from four published gene expression profiling microarray studies. Of 88 candidate biomarkers associated with IGHV somatic mutation status, we identified LDOC1 (Leucine Zipper, Down-regulated in Cancer 1) as one of the most significantly differentially expressed genes that distinguished mutated from unmutated CLL cases. LDOC1 is a putative transcription factor of unknown function in B-cell development and CLL pathophysiology. Using a highly sensitive quantitative RT-PCR (QRT-PCR) assay, we confirmed that LDOC1 mRNA was dramatically down-regulated in mutated compared to unmutated CLL cases. Expression of LDOC1 mRNA was also strongly associated with other markers of poor prognosis, including ZAP70 protein and cytogenetic abnormalities of poor prognosis (deletions of chromosomes 6q21, 11q23, and 17p13.1, and trisomy 12). CLL cases positive for LDOC1 mRNA had significantly shorter overall survival than negative cases. Moreover, in a multivariate model, LDOC1 mRNA expression predicted overall survival better than IGHV mutation status or ZAP70 protein, among the best markers of prognosis in CLL. We also discovered LDOC1S, a new LDOC1 splice variant. Using isoform-specific QRT-PCR assays that we developed, we found that both isoforms were expressed in normal B cells (naïve > memory), unmutated CLL cells, and in B-cell non-Hodgkin lymphomas with unmutated IGHV genes. To investigate pathways in which LDOC1 is involved, we knocked down LDOC1 in HeLa cells and performed global gene expression profiling.
GFI1 (Growth Factor-Independent 1) emerged as a significantly up-regulated gene in both HeLa cells and CLL cells that expressed high levels of LDOC1. GFI1 oncoprotein is implicated in hematopoietic stem cell maintenance, lymphocyte development, and lymphomagenesis. Our findings indicate that LDOC1 mRNA is an excellent biomarker of overall survival in CLL, and may contribute to B-cell differentiation and malignant transformation.
Abstract:
Improvements in the analysis of microarray images are critical for accurately quantifying gene expression levels. The acquisition of accurate spot intensities directly influences the results and interpretation of statistical analyses. This dissertation discusses the implementation of a novel approach to the analysis of cDNA microarray images. We use a stellar photometric model, the Moffat function, to quantify microarray spots from nylon microarray images. The inherent flexibility of the Moffat shape model makes it ideal for quantifying microarray spots. We apply our novel approach to a Wilms' tumor microarray study and compare our results with a fixed-circle segmentation approach for spot quantification. Our results suggest that different spot feature extraction methods can have an impact on the ability of statistical methods to identify differentially expressed genes. We also used the Moffat function to simulate a series of microarray images under various experimental conditions. These simulations were used to validate the performance of various statistical methods for identifying differentially expressed genes. Our simulation results indicate that tests taking into account the dependency between mean spot intensity and variance estimation, such as the smoothened t-test, can better identify differentially expressed genes, especially when the number of replicates and mean fold change are low. The analysis of the simulations also showed that overall, a rank sum test (Mann-Whitney) performed well at identifying differentially expressed genes. Previous work has suggested the strengths of nonparametric approaches for identifying differentially expressed genes. We also show that multivariate approaches, such as hierarchical and k-means cluster analysis along with principal components analysis, are only effective at classifying samples when replicate numbers and mean fold change are high. 
Finally, we show how our stellar shape model approach can be extended to the analysis of 2D-gel images by adapting the Moffat function to take into account the elliptical nature of spots in such images. Our results indicate that stellar shape models offer a previously unexplored approach for the quantification of 2D-gel spots.
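For orientation, the Moffat profile has the standard form I(r) = I0 (1 + (r/alpha)^2)^(-beta). The sketch below evaluates it for a circular spot and for an axis-aligned elliptical variant. The parameter names, the constant background term, and the particular elliptical form are assumptions made for illustration; the abstract does not state how the dissertation parameterizes either model.

```python
import math


def moffat(x, y, x0, y0, amplitude, alpha, beta, background=0.0):
    """Circular Moffat profile evaluated at pixel (x, y).

    alpha sets the width of the core and beta the weight of the wings.
    The additive background term is an assumption for illustration.
    """
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return background + amplitude * (1.0 + r2 / alpha ** 2) ** (-beta)


def moffat_fwhm(alpha, beta):
    """Full width at half maximum of the circular Moffat profile."""
    return 2.0 * alpha * math.sqrt(2.0 ** (1.0 / beta) - 1.0)


def elliptical_moffat(x, y, x0, y0, amplitude, alpha_x, alpha_y, beta,
                      background=0.0):
    """Axis-aligned elliptical variant (hypothetical form).

    Separate widths along x and y replace the single alpha; with
    alpha_x == alpha_y it reduces to the circular profile.
    """
    q = ((x - x0) / alpha_x) ** 2 + ((y - y0) / alpha_y) ** 2
    return background + amplitude * (1.0 + q) ** (-beta)


if __name__ == "__main__":
    # Hypothetical spot: peak 100, alpha 3 pixels, beta 2.5.
    half_width = moffat_fwhm(3.0, 2.5) / 2.0
    print(moffat(0.0, 0.0, 0.0, 0.0, 100.0, 3.0, 2.5))
    print(moffat(half_width, 0.0, 0.0, 0.0, 100.0, 3.0, 2.5))
```

In practice the parameters would be fitted to the pixel intensities around each spot, and the fitted profile integrated to obtain a spot intensity; the fitting step is omitted here.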
Abstract:
Most studies of p53 function have focused on genes transactivated by p53. It is less widely appreciated that p53 can repress target genes to affect a particular cellular response. There is evidence that repression is important for p53-induced apoptosis and cell cycle arrest. It is less clear if repression is important for other p53 functions. A comprehensive knowledge of the genes repressed by p53 and the cellular processes they affect is currently lacking. We used an expression profiling strategy to identify p53-responsive genes following adenoviral p53 gene transfer (Ad-p53) in PC3 prostate cancer cells. A total of 111 genes represented on the Affymetrix U133A microarray were repressed more than twofold (p ≤ 0.05) by p53. An objective assessment of array data quality was carried out using RT-PCR of 20 randomly selected genes. We estimate a confirmation rate of >95.5% for the complete data set. Functional over-representation analysis was used to identify cellular processes potentially affected by p53-mediated repression. Cell cycle regulatory genes exhibited significant enrichment (p ≤ 5E-28) within the repressed targets. Several of these genes are repressed in a p53-dependent manner following DNA damage, but preceding cell cycle arrest. These findings identify novel p53-repressed targets and indicate that p53-induced cell cycle arrest is a function of not only the transactivation of cell cycle inhibitors (e.g., p21), but also the repression of targets that act at each phase of the cell cycle. The mechanism of repression of this set of p53 targets was investigated. Most of the repressed genes identified here do not harbor consensus p53 DNA binding sites but do contain binding sites for E2F transcription factors. We demonstrate a role for E2F/RB repressor complexes in our system. Importantly, p53 is found at the promoter of CDC25A. CDC25A protein is rapidly degraded in response to DNA damage.
Our group has demonstrated for the first time that CDC25A is also repressed at the transcript level by p53. This work has important implications for understanding the DNA damage cell cycle checkpoint response and the link between E2F/RB complexes and p53 in the repression of target genes.
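Over-representation analysis is commonly carried out with a one-sided hypergeometric test. The abstract does not say which test was used, so the following is only a sketch under that assumption. The function name and the toy counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, selected, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population: number of genes represented on the array
    annotated:  genes on the array that belong to the category of interest
    selected:   size of the gene list being tested (e.g. repressed genes)
    overlap:    genes in the list that belong to the category
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Toy numbers only: 10 genes on the array, 4 in the category,
    # 3 genes selected, 2 of which fall in the category.
    print(hypergeom_upper_tail(10, 4, 3, 2))
```

A small p-value indicates that the category appears in the gene list more often than random selection from the array would explain. When many categories are tested, the resulting p-values would normally be adjusted for multiple comparisons.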
Abstract:
Chromatin, composed of repeating nucleosome units, is the genetic polymer of life. To aid in DNA compaction and organized storage, the double helix wraps around a core complex of histone proteins to form the nucleosome, and is therefore no longer freely accessible to cellular proteins for the processes of transcription, replication and DNA repair. Over the course of evolution, DNA-based applications have developed routes to access DNA bound up in chromatin, and further, have actually utilized the chromatin structure to create another level of complexity and information storage. The histone molecules that DNA surrounds have free-floating tails that extend out of the nucleosome. These tails are post-translationally modified to create docking sites for the proteins involved in transcription, replication and repair, thus providing one prominent way that specific genomic sequences are accessed and manipulated. Adding another degree of information storage, histone tail-modifications paint the genome in precise manners to influence a state of transcriptional activity or repression, to generate euchromatin, containing gene-dense regions, or heterochromatin, containing repeat sequences and low-density gene regions. The work presented here is the study of histone tail modifications, how they are written and how they are read, divided into two projects. Both begin with protein microarray experiments where we discover the protein domains that can bind modified histone tails, and how multiple tail modifications can influence this binding. Project one then looks deeper into the enzymes that lay down the tail modifications. Specifically, we studied histone-tail arginine methylation by PRMT6. We found that methylation of a specific histone residue by PRMT6, arginine 2 of H3, can antagonize the binding of protein domains to the H3 tail and therefore affect transcription of genes regulated by the H3-tail binding proteins. 
Project two focuses on a protein we identified to bind modified histone tails, PHF20, and was an endeavor to discover the biological role of this protein. Thus, in total, we are looking at a complete process: (1) histone tail modification by an enzyme (here, PRMT6), (2) how this and other modifications are bound by conserved protein domains, and (3) by using PHF20 as an example, the functional outcome of binding through investigating the biological role of a chromatin reader.
Abstract:
The difficulty of detecting differential gene expression in microarray data has existed for many years. Several correction procedures aim to control error rates in multiple comparisons, including the Bonferroni and Sidak single-step p-value adjustments and Holm's step-down correction method, which control the family-wise error rate, and Benjamini and Hochberg's false discovery rate (FDR) correction procedure. Each multiple comparison technique has its advantages and weaknesses. We studied each multiple comparison method through numerical studies (simulations) and applied the methods to real exploratory DNA microarray data aimed at detecting molecular signatures in papillary thyroid cancer (PTC) patients. According to our simulation results, the Benjamini and Hochberg step-up FDR-controlling procedure performed best among these multiple comparison methods, and we discovered 1277 potential biomarkers among 54675 probe sets after applying it to the PTC microarray data.
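The four adjustments compared in this abstract can be sketched in a few lines of Python. This is a minimal illustration assuming a plain list of raw p-values; the function names and the example p-values are hypothetical and are not taken from the dissertation.

```python
def bonferroni(pvals):
    """Single-step Bonferroni adjustment: multiply each p-value by m."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def sidak(pvals):
    """Single-step Sidak adjustment: 1 - (1 - p)^m."""
    m = len(pvals)
    return [1.0 - (1.0 - p) ** m for p in pvals]


def holm(pvals):
    """Holm step-down adjustment (controls the family-wise error rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Smallest p-value is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up adjustment (controls the FDR)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # keeping a running minimum so adjusted values stay monotone.
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        running_min = min(running_min, pvals[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
    for name, fn in [("Bonferroni", bonferroni), ("Sidak", sidak),
                     ("Holm", holm), ("Benjamini-Hochberg", benjamini_hochberg)]:
        print(name, [round(v, 4) for v in fn(raw)])
```

A probe set is declared significant when its adjusted p-value falls below the chosen threshold. On the same input, the Benjamini-Hochberg values are never larger than the Holm values, which is why an FDR procedure typically reports more candidates than the family-wise methods.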
Abstract:
Most studies of differential gene expression have been conducted between two given conditions. The two-condition experimental (TCE) approach is simple in that all genes detected display a common differential expression pattern responsive to a common two-condition difference. Therefore, genes that are differentially expressed under conditions other than the given two are undetectable with the TCE approach. To address this problem, we propose a new approach called multiple-condition experiment (MCE) without replication and develop corresponding statistical methods including inference of pairs of conditions for genes, new t-statistics, and a generalized multiple-testing method for any multiple-testing procedure via a control parameter C. We applied these statistical methods to analyze our real MCE data from breast cancer cell lines and found that 85 percent of gene-expression variations were caused by genotypic effects and genotype-ANAX1 overexpression interactions, which agrees well with our expected results. We also applied our methods to the adenoma dataset of Notterman et al. and identified 93 differentially expressed genes that could not be found in TCE. The MCE approach is a conceptual breakthrough in many aspects: (a) many conditions of interest can be studied simultaneously; (b) study of association between differential expression of genes and conditions becomes easy; (c) it can provide more precise information for molecular classification and diagnosis of tumors; (d) it can save a lot of experimental resources and time for investigators.