17 results for phenotype ontology
Abstract:
My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status. Section 1: The cost of DNA sequencing has fallen dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However, it remains unclear how sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves as sequencing depth increases. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model outperforms both the binomial and the pure beta-binomial models, yielding a lower false discovery rate. Section 2: Although a number of methods have been proposed to accurately analyze differential RNA expression at the gene level, modeling at the base-pair level is also required. Here, we find that the overdispersion rate decreases as the sequencing depth increases at the base-pair level. We also propose four models and compare them with each other. As expected, our beta-binomial model with a dynamic overdispersion rate is shown to be superior. Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe a previously unreported bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. We also find that this influence is related to the local sequence of the random hexamer that is used in priming.
We propose a model of the inequality between samples to correct this type of bias. Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume that the thresholds are specific to each gene. It is also of interest to investigate genes that are largely controlled by DNA methylation. These genes are called “L-shaped” genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of breast cancer (BRCA). In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. We also reveal a new bias in RNA-seq and provide a detailed understanding of the relationship between this bias and the local sequence. Finally, we develop a powerful method to dichotomize methylation status and consequently identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
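The abstract above does not give the functional form of its depth-dependent overdispersion parameter. As an illustration only, the following sketch assumes a hypothetical power-law link, rho(depth) = rho0 * depth^(-gamma), and plugs it into a standard beta-binomial parameterized by mean p and overdispersion rho. The names `depth_dependent_rho`, `rho0`, and `gamma` are assumptions made for this example and are not taken from the dissertation.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def depth_dependent_rho(depth, rho0, gamma):
    """Hypothetical link (an assumption, not the author's model):
    overdispersion shrinks as a power of sequencing depth."""
    return rho0 * depth ** (-gamma)


def beta_binomial_logpmf(k, n, p, rho):
    """Log-probability of k successes in n trials under a beta-binomial
    with mean proportion p and overdispersion rho, where 0 < rho < 1."""
    a = p * (1.0 - rho) / rho
    b = (1.0 - p) * (1.0 - rho) / rho
    return log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b)


def beta_binomial_variance(n, p, rho):
    """Variance of the beta-binomial: the binomial variance n*p*(1-p)
    inflated by the factor 1 + (n - 1) * rho."""
    return n * p * (1.0 - p) * (1.0 + (n - 1) * rho)
```

With rho close to 0 the distribution approaches the plain binomial, so a link of this kind lets deeper sequencing runs behave more like binomial sampling, which is the qualitative trend the abstract reports.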
Abstract:
Mutations in the p53 tumor suppressor gene are found in over 50% of human tumors and in the germline of Li-Fraumeni syndrome families. About 80% of these mutations are missense in nature. To study how p53 missense mutations affect tumorigenesis in vivo, we focused on the murine p53 arg-to-his mutation at amino acid 172, which corresponds to the human hot spot mutation at amino acid 175. The double replacement procedure was employed to introduce the p53 R172H mutation into the p53 locus of ES cells, and mice were generated. An additional 1 bp deletion in the intron 2 splice acceptor site was detected in the same allele in mice. We named this allele p53R172HΔg. This allele makes a small amount of full-length p53 mutant protein.
Spontaneous tumor formation and survival were studied in these mice. Mice heterozygous for the p53R172HΔg allele showed 50% survival at 17 months of age, similar to the p53+/− mice. Moreover, the p53R172HΔg/+ mice showed a distinct tumor spectrum: 55% sarcomas, including osteosarcomas, fibrosarcomas and angiosarcomas; 27% carcinomas, including lung adenocarcinomas, squamous cell carcinomas, hepatocellular carcinomas and islet cell carcinomas; and 18% lymphomas. Compared to the p53+/− mice, there was a clear increase in the frequency of carcinoma development and a decrease in lymphoma incidence. Among the sarcomas that developed, fibrosarcomas in the skin were also more frequently observed. More importantly, osteosarcomas and carcinomas that developed in the p53R172HΔg/+ mice metastasized at very high frequency (64% and 67%, respectively) compared with less than 10% in the p53+/− mice. The metastatic lesions were usually found in lung and liver, and less frequently in other tissues. The altered tumor spectrum in the mice and the increased metastatic potential of the tumors suggest that the p53R172H mutation represents a gain of function.
Mouse embryonic fibroblasts (MEFs) from mice homozygous and heterozygous for the p53R172HΔg allele were studied for growth characteristics, immortalization potential and genomic instability. All of the p53R172HΔg/+ MEF lines were immortalized under a 3T3 protocol, whereas p53+/− MEFs were not immortalized under the same protocol. Karyotype analysis showed a persistent appearance of chromosome end-to-end fusions in MEFs both homozygous and heterozygous for the p53R172HΔg allele. These observations suggest that increased genomic instability in the cells may cause the altered tumor phenotypes.