Most studies of differential gene expression have been conducted between two given conditions. The two-condition experimental (TCE) approach is simple in that all genes detected display a common differential expression pattern responsive to a common two-condition difference. Consequently, genes that are differentially expressed under conditions other than the given two are undetectable with the TCE approach. To address this problem, we propose a new approach called the multiple-condition experiment (MCE) without replication and develop corresponding statistical methods, including inference of pairs of conditions for genes, new t-statistics, and a generalized multiple-testing method for any multiple-testing procedure via a control parameter C. We applied these statistical methods to our real MCE data from breast cancer cell lines and found that 85 percent of gene-expression variations were caused by genotypic effects and genotype-ANAX1 overexpression interactions, which agrees well with our expected results. We also applied our methods to the adenoma dataset of Notterman et al. and identified 93 differentially expressed genes that could not be found in TCE. The MCE approach is a conceptual breakthrough in many aspects: (a) many conditions of interest can be studied simultaneously; (b) study of the association between differential expression of genes and conditions becomes easy; (c) it can provide more precise information for molecular classification and diagnosis of tumors; (d) it can save a lot of experimental resources and time for investigators.


Studies have demonstrated a variable response to ozone among individuals and among animal species and strains. For instance, C57BL/6J mice have a greater inflammatory response to ozone exposure than C3H/HeJ mice. In these studies, I utilized these strain differences in an effort to derive a mechanistic explanation for the variable strain sensitivity to ozone exposure. Alveolar macrophages (AM) from C57BL/6J and C3H/HeJ mice were exposed in vitro to hydrogen peroxide (H2O2), heat and acetyl ceramide, or in vivo to ozone. Necrosis and DNA fragmentation in macrophages from the two murine strains were determined to assess cytotoxicity following these treatments. In addition, synthesis and expression of the stress proteins, stress protein 72 (SP72) and heme oxygenase (HO-1), were examined following treatments. The in vitro experiments were conducted to eliminate the possibility of in vivo confounders (i.e., differences in breathing rates in the two strains) and thus directly implicate some inherent difference between cells from the two murine strains. H2O2 and heat caused greater cytotoxicity in AM from C57BL/6J than C3H/HeJ mice, and DNA fragmentation was a particularly sensitive indicator of cell injury. Similarly, AM from C57BL/6J mice were more sensitive to ozone exposure than cells from C3H/HeJ mice. Exposure to either 1 or 0.4 ppm ozone caused greater cytotoxicity in macrophages from C57BL/6J mice compared to macrophages from C3H/HeJ mice. The increased sensitivity of AM to injury was associated with decreased synthesis and expression of stress proteins. AM from C57BL/6J mice synthesized and expressed significantly lower amounts of stress proteins in response to heat and ozone than AM from C3H/HeJ mice. Heat treatment resulted in greater synthesis and expression of SP72. In addition, macrophages from C57BL/6J mice expressed lower amounts of HO-1 than macrophages from C3H/HeJ mice following 0.4 ppm ozone exposure.
Therefore, AM from C57BL/6J mice are more susceptible to oxidative injury than AM from C3H/HeJ mice, which might be due to differential expression of stress proteins in these cells.


My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status. Section 1: The cost of DNA sequencing has fallen dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However, it remains unclear how sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves with increasing sequencing depth. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model performs better than the binomial or the pure beta-binomial model, with a lower false discovery rate. Section 2: Although a number of methods have been proposed to accurately analyze differential RNA expression on the gene level, modeling on the base pair level is also required. Here, we find that the overdispersion rate decreases as the sequencing depth increases on the base pair level. We also propose four models and compare them with each other. As expected, our beta-binomial model with a dynamic overdispersion rate is shown to be superior. Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe a previously undiscovered bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. We also find that this influence is related to the local sequence of the random hexamer that is used in priming.
We propose a model of the inequality between samples to correct this type of bias. Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume that the thresholds are specific to each gene. It is also intriguing to investigate genes that are largely controlled by DNA methylation. These genes are called “L-shaped” genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of breast cancer (BRCA). In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. We also reveal a new bias in RNA-seq and characterize the relationship between this bias and the local sequence. Finally, we develop a powerful method to dichotomize methylation status and consequently identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
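
The depth-dependent overdispersion idea from Sections 1 and 2 can be illustrated with a small sketch. The abstract does not state the functional form that links overdispersion to sequencing depth, so the power-law decay in `depth_dependent_rho` and its parameters `rho0` and `decay` are hypothetical choices made only for illustration; the beta-binomial probability mass function and its variance formula are standard.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(k, n, p, rho):
    """Log-probability of k successes in n trials under a beta-binomial
    with mean proportion p and overdispersion rho (0 < rho < 1)."""
    s = (1.0 - rho) / rho          # alpha + beta
    a, b = p * s, (1.0 - p) * s    # alpha, beta
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def betabinom_variance(n, p, rho):
    """Variance of the beta-binomial: the binomial variance inflated
    by the factor 1 + (n - 1) * rho."""
    return n * p * (1.0 - p) * (1.0 + (n - 1) * rho)


def depth_dependent_rho(n, rho0=0.05, decay=0.5):
    """HYPOTHETICAL link: overdispersion shrinks as depth n grows.
    The abstract does not specify this form."""
    return rho0 * n ** (-decay)


if __name__ == "__main__":
    for depth in (10, 100, 1000):
        rho = depth_dependent_rho(depth)
        print(depth, round(rho, 5), round(betabinom_variance(depth, 0.3, rho), 3))
```

With `rho` fixed, the variance inflation grows with depth; letting `rho` fall with depth, as the abstract reports, tempers that growth.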


The neu oncogene encodes a growth factor receptor-like protein, p185, with an intrinsic tyrosine kinase activity. A single point mutation, an A to T transversion resulting in an amino acid substitution from valine to glutamic acid, in the transmembrane domain of the rat neu gene was found to be responsible for the transforming and tumorigenic phenotype of the cells that carry it. In contrast, the human proto-neu oncogene is frequently amplified in tumors and cell lines derived from tumors, and overexpression/amplification of the human neu gene in breast and ovarian cancers is known to correlate with poor patient prognosis. Examples of human neu gene overexpression in the absence of gene amplification have been observed, which may suggest a significant role for transcriptional and/or post-transcriptional control of the neu gene in the oncogenic process. However, little is known about the transcriptional mechanisms that regulate neu gene expression. In this study, three examples are presented to demonstrate the positive and negative control of neu gene expression. First, by using band shift assays and methylation interference analyses, I have identified a specific protein-binding sequence, AAGATAAAACC (-466 to -456), that binds a specific trans-acting factor termed RVF (for EcoRV factor on the neu promoter). The RVF-binding site is required for maximum transcriptional activity of the rat neu promoter. This same sequence is also found in the corresponding regions of both the human and mouse neu promoters. Furthermore, this sequence can enhance the CAT activity driven by a minimal promoter of the thymidine kinase gene in an orientation-independent manner, and thus it behaves as an enhancer.
In addition, Southwestern (DNA-protein) blot analysis using the RVF-binding site as a probe points to a 60-kDa polypeptide as a potential candidate for RVF. Second, it has been reported that the E3 region of adenovirus 5 induces down-regulation of the epidermal growth factor (EGF) receptor through endocytosis. I found that the human neu gene product, p185 (an EGF receptor-related protein), is also down-regulated by adenovirus 5, but via a different mechanism. I demonstrate that the adenovirus E1a gene is responsible for the repression of the human neu gene at the transcriptional level. Third, differential expression of the neu gene has been found in two cell model systems: between the mouse fibroblast Swiss-Webster 3T3 (SW3T3) and its variant NR-6 cells; and between the mouse liver tumor cell line, Hep1-a, and the mouse pancreas tumor cell line, 266-6. Neither the NR-6 nor the 266-6 cell line is able to express the neu gene product, p185. I demonstrate that, in both cases, transcriptional repression of the neu gene may account for the lack of p185 expression in these two cell lines.


Understanding the effects of the external environment on bacterial gene expression can provide valuable insights into an array of cellular mechanisms including pathogenesis, drug resistance, and, in the case of Mycobacterium tuberculosis, latency. Because of the absence of poly(A)+ mRNA in prokaryotic organisms, studies of differential gene expression currently must be performed either with large amounts of total RNA or rely on amplification techniques that can alter the proportional representation of individual mRNA sequences. We have developed an approach to study differences in bacterial mRNA expression that enables amplification by the PCR of a complex mixture of cDNA sequences in a reproducible manner that obviates the confounding effects of selected highly expressed sequences, e.g., ribosomal RNA. Differential expression using customized amplification libraries (DECAL) uses a library of amplifiable genomic sequences to convert total cellular RNA into an amplified probe for gene expression screens. DECAL can detect 4-fold differences in the mRNA levels of rare sequences and can be performed on as little as 10 ng of total RNA. DECAL was used to investigate the in vitro effect of the antibiotic isoniazid on M. tuberculosis, and three previously uncharacterized isoniazid-induced genes, iniA, iniB, and iniC, were identified. The iniB gene has homology to cell wall proteins, and iniA contains a phosphopantetheine attachment site motif suggestive of an acyl carrier protein. The iniA gene is also induced by the antibiotic ethambutol, an agent that inhibits cell wall biosynthesis by a mechanism that is distinct from isoniazid. The DECAL method offers a powerful new tool for the study of differential gene expression.


Tissue factor (TF) is the cellular receptor for an activated form of clotting factor VII (VIIa), and the binding of factor VII(a) to TF initiates the coagulation cascade. Sequence and structural patterns extracted from a global alignment of TF indicate homology with interferon receptors of the cytokine receptor superfamily. Several recent studies suggested that TF could function as a genuine signal-transducing receptor. However, it is unknown which biological function(s) of cells are altered upon binding of the ligand, VIIa, to TF. In the present study, we examined the effect of VIIa binding to cell surface TF on cellular gene expression in fibroblasts. The differential mRNA display PCR technique was used to identify transcriptional changes in fibroblasts upon VIIa binding to TF. The display showed that VIIa binding to TF either up- or down-regulated several mRNA species. The differential expression of one such transcript, which was up-regulated by VIIa, was confirmed by Northern blot analysis. Isolation of a full-length cDNA corresponding to the differentially expressed transcript revealed that the VIIa-up-regulated gene was poly(A) polymerase (PAP). Northern blot analysis of various carcinomas and normal human tissues revealed overexpression of PAP in cancer tissues. Enhanced expression of PAP upon VIIa binding to tumor cell TF may potentially play an important role in tumor metastasis.


Electrical coupling by gap junctions is an important form of cell-to-cell communication in early brain development. Whereas glial cells remain electrically coupled at postnatal stages, adult vertebrate neurons were thought to communicate mainly via chemical synapses. There is now accumulating evidence that in certain neuronal cell populations the capacity for electrical signaling by gap junction channels is still present in the adult. Here we identified electrically coupled pairs of neurons between postnatal days 12 and 18 in rat visual cortex, somatosensory cortex, and hippocampus. Notably, coupling was found both between pairs of inhibitory neurons and between inhibitory and excitatory neurons. Molecular analysis by single-cell reverse transcription–PCR revealed a differential expression pattern of connexins in these identified neurons.


The cDNA microarray is one technological approach that has the potential to accurately measure changes in global mRNA expression levels. We report an assessment of an optimized cDNA microarray platform to generate accurate, precise and reliable data consistent with the objective of using microarrays as an acquisition platform to populate gene expression databases. The study design consisted of two independent evaluations with 70 arrays from two different manufactured lots and used three human tissue sources as samples: placenta, brain and heart. Overall signal response was linear over three orders of magnitude and the sensitivity for any element was estimated to be 2 pg mRNA. The calculated coefficient of variation for differential expression for all non-differentiated elements was 12–14% across the entire signal range and did not vary with array batch or tissue source. The minimum detectable fold change for differential expression was 1.4. Accuracy, in terms of bias (observed minus expected differential expression ratio), was less than 1 part in 10 000 for all non-differentiated elements. The results presented in this report demonstrate the reproducible performance of the cDNA microarray technology platform and the methods provide a useful framework for evaluating other technologies that monitor changes in global mRNA expression.
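
The precision and accuracy measures named in this abstract (coefficient of variation, and bias defined as observed minus expected differential expression ratio) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the replicate ratios are made-up numbers, and an expected ratio of 1.0 is assumed for non-differentiated elements.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def bias(observed_ratios, expected_ratio=1.0):
    """Mean observed differential expression ratio minus the expected ratio."""
    return statistics.mean(observed_ratios) - expected_ratio


if __name__ == "__main__":
    # Hypothetical replicate ratios for one non-differentiated element.
    ratios = [0.9, 1.1, 1.0, 1.0]
    print("CV:", coefficient_of_variation(ratios))
    print("bias:", bias(ratios))
```

A CV near 0.12 to 0.14 on such ratios would correspond to the 12–14% figure reported above.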


Neuropathological and brain imaging studies suggest that schizophrenia may result from neurodevelopmental defects. Cytoarchitectural studies indicate cellular abnormalities suggestive of a disruption in neuronal connectivity in schizophrenia, particularly in the dorsolateral prefrontal cortex. Yet, the molecular mechanisms underlying these findings remain unclear. To identify molecular substrates associated with schizophrenia, DNA microarray analysis was used to assay gene expression levels in postmortem dorsolateral prefrontal cortex of schizophrenic and control patients. Genes determined to have altered expression levels in schizophrenics relative to controls are involved in a number of biological processes, including synaptic plasticity, neuronal development, neurotransmission, and signal transduction. Most notable was the differential expression of myelination-related genes suggesting a disruption in oligodendrocyte function in schizophrenia.


Liver-specific and nonliver-specific methionine adenosyltransferases (MATs) are products of two genes, MAT1A and MAT2A, respectively, that catalyze the formation of S-adenosylmethionine (AdoMet), the principal biological methyl donor. Mature liver expresses MAT1A, whereas MAT2A is expressed in extrahepatic tissues and is induced during liver growth and dedifferentiation. To examine the influence of MAT1A on hepatic growth, we studied the effects of a targeted disruption of the murine MAT1A gene. MAT1A mRNA and protein levels were absent in homozygous knockout mice. At 3 months, plasma methionine level increased 776% in knockouts. Hepatic AdoMet and glutathione levels were reduced by 74 and 40%, respectively, whereas S-adenosylhomocysteine, methylthioadenosine, and global DNA methylation were unchanged. The body weight of 3-month-old knockout mice was unchanged from wild-type littermates, but the liver weight was increased 40%. The Affymetrix genechip system and Northern and Western blot analyses were used to analyze differential expression of genes. The expression of many acute phase-response and inflammatory markers, including orosomucoid, amyloid, metallothionein, Fas antigen, and growth-related genes, including early growth response 1 and proliferating cell nuclear antigen, is increased in the knockout animal. At 3 months, knockout mice are more susceptible to choline-deficient diet-induced fatty liver. At 8 months, knockout mice developed spontaneous macrovesicular steatosis and predominantly periportal mononuclear cell infiltration. Thus, absence of MAT1A resulted in a liver that is more susceptible to injury, expresses markers of an acute phase response, and displays increased proliferation.


This work illustrates potential adverse effects linked with the expression of proteinase inhibitor (PI) in plants used as a strategy to enhance pest resistance. Tobacco (Nicotiana tabacum L. cv Xanthi) and Arabidopsis (Arabidopsis thaliana [Heynh.] ecotype Wassilewskija) transgenic plants expressing the mustard trypsin PI 2 (MTI-2) at different levels were obtained. First-instar larvae of the Egyptian cotton worm (Spodoptera littoralis Boisd.) were fed on detached leaves of these plants. The high level of MTI-2 expression in leaves had deleterious effects on larvae, causing mortality and decreasing mean larval weight, and was correlated with a decrease in the leaf surface eaten. However, larvae fed leaves from plants expressing MTI-2 at the low expression level did not show increased mortality, but showed a net gain in weight and faster development compared with control larvae. The low MTI-2 expression level also resulted in increased leaf damage. These observations are correlated with the differential expression of digestive proteinases in the larval gut: overexpression of existing proteinases on low-MTI-2-expression level plants and induction of new proteinases on high-MTI-2-expression level plants. These results emphasize that, in developing a PI-based defense strategy for plants, it is critical to obtain the appropriate PI-expression level relative to the pest's sensitivity threshold to that PI.


Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery.


Parasite proteases play key roles in several fundamental steps of the Plasmodium life cycle, including haemoglobin degradation, host cell invasion and parasite egress. Plasmodium exit from infected host cells appears to be mediated by a class of papain-like cysteine proteases called 'serine repeat antigens' (SERAs). A SERA subfamily, represented by Plasmodium falciparum SERA5, contains an atypical active site serine residue instead of a catalytic cysteine. Members of this SERAser subfamily are abundantly expressed in asexual blood stages, rendering them attractive drug and vaccine targets. In this study, we show by antibody localization and in vivo fluorescent tagging with the red fluorescent protein mCherry that the two P. berghei serine-type family members, PbSERA1 and PbSERA2, display differential expression towards the final stages of merozoite formation. Via targeted gene replacement, we generated single and double gene knockouts of the P. berghei SERAser genes. These loss-of-function lines progressed normally through the parasite life cycle, suggesting a specialized, non-vital role for serine-type SERAs in vivo. Parasites lacking PbSERAser showed increased expression of the cysteine-type PbSERA3. Compensatory mechanisms between distinct SERA subfamilies may thus explain the absence of phenotypical defect in SERAser disruptants, and challenge the suitability to develop potent antimalarial drugs based on specific inhibitors of Plasmodium serine-type SERAs.


We have constructed cDNA microarrays for soybean (Glycine max L. Merrill), containing approximately 4,100 Unigene ESTs derived from axenic roots, to evaluate their application and utility for functional genomics of organ differentiation in legumes. We assessed microarray technology by conducting studies to evaluate the accuracy of microarray data and have found them to be both reliable and reproducible in repeat hybridisations. Several ESTs showed high levels (>50-fold) of differential expression in either root or shoot tissue of soybean. A small number of physiologically interesting, differentially expressed sequences found by microarray analysis were verified by both quantitative real-time RT-PCR and Northern blot analysis. There was a linear correlation (r² = 0.99, over 5 orders of magnitude) between microarray and quantitative real-time RT-PCR data. Microarray analysis of soybean has enormous potential not only for the discovery of new genes involved in tissue differentiation and function, but also for studying the expression of previously characterised genes, gene networks and gene interactions in wild-type, mutant or transgenic plants.
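
A correlation spanning several orders of magnitude, such as the one reported between microarray and RT-PCR data, is usually computed on log-transformed values. The sketch below assumes a log10 transform and uses invented expression ratios; the abstract does not say which transform or data the authors used.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical expression ratios measured by the two methods.
    microarray = [0.5, 3.0, 40.0, 900.0, 12000.0]
    rt_pcr = [0.4, 3.5, 35.0, 1100.0, 10000.0]
    log_array = [math.log10(v) for v in microarray]
    log_pcr = [math.log10(v) for v in rt_pcr]
    print("r^2 on log10 scale:", r_squared(log_array, log_pcr))
```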


Merkel cell carcinoma (MCC) is a rare, aggressive skin tumor that shares histopathological and genetic features with small-cell lung carcinoma (SCLC); both are of neuroendocrine origin. Comparable to SCLC, MCC cell lines are classified into two different biochemical subgroups designated as 'Classic' and 'Variant'. With the aim of identifying typical gene-expression signatures associated with these phenotypically different MCC cell line subgroups and searching for differentially expressed genes between MCC and SCLC, we used cDNA arrays to profile 10 MCC cell lines and four SCLC cell lines. Using significance analysis of microarrays, we defined a set of 76 differentially expressed genes that allowed unequivocal identification of Classic and Variant MCC subgroups. We assume that the differential expression levels of some of these genes reflect, analogous to SCLC, the different biological and clinical properties of Classic and Variant MCC phenotypes. Therefore, they may serve as useful prognostic markers and potential targets for the development of new therapeutic interventions specific for each subgroup. Moreover, our analysis identified 17 powerful classifier genes capable of discriminating MCC from SCLC. Real-time quantitative RT-PCR analysis of these genes on 26 additional MCC and SCLC samples confirmed their diagnostic classification potential, opening opportunities for new investigations into these aggressive cancers.