20 results for Dinucleotide Repeat Polymorphism
in DigitalCommons@The Texas Medical Center
Abstract:
The myogenin gene encodes an evolutionarily conserved basic helix-loop-helix transcription factor that regulates the expression of skeletal muscle-specific genes, and its homozygous deletion results in mice that die of respiratory failure at birth. The histology of skeletal muscle in the myogenin null mice is reminiscent of that found in some severe congenital myopathy patients, many of whom also die of respiratory complications. This similarity provides the rationale that an aberrant human myogenin (myf4) coding region could be associated with some congenital myopathy conditions. With PCR, we found similarly sized amplimers for the three exons of the myogenin gene in 37 patient and 40 control samples. In contrast to the GenBank sequence for human myogenin, we report several differences in flanking and coding regions plus an additional 659 and 498 bp in the first and second introns, respectively, in all patients and controls. We also find a novel (CA)-dinucleotide repeat in the second intron. No causative mutations were detected in the myogenin coding regions of genomic DNA from patients with severe congenital myopathy. Severe congenital myopathies in humans are often associated with respiratory complications and pulmonary hypoplasia. We have employed the myogenin null mouse, which lacks normal development of skeletal muscle fibers, as a genetically defined severe congenital myopathy mouse model to evaluate the effect of absent fetal breathing movement on pulmonary development. Significant differences in lung:body weight, total DNA, and histology are observed at embryonic days E14, E17, and E20, suggesting that the myogenin null lungs are hypoplastic. RT-PCR, in situ immunofluorescence, and EM reveal type II pneumocyte differentiation in both null and wild-type lungs as early as E14. However, at E14, myogenin null lungs have decreased BrdU incorporation, while from E17 through term, augmented cell death is detected in the myogenin null lungs but not in wild-type littermates. 
Absent mechanical forces appear to impair normal growth, but not maturation, of the developing lungs in the myogenin null mouse. These investigations provide the basis for delineating the DNA sequence of the myogenin gene and highlight the importance of skeletal muscle development in utero for normal lung organogenesis. My observation of no mutations within the coding regions of the human myogenin gene in DNA from patients with severe congenital myopathy does not support any association with this condition.
Abstract:
Gene silencing due to promoter methylation is an alternative to mutations and deletions, which inactivate tumor suppressor genes (TSG) in cancer. We identified RIL by Methylated CpG Island Amplification technique as a novel aberrantly methylated gene. RIL is expressed in normal tissues and maps to the 5q31 region, frequently deleted in leukemias. We found methylation of RIL in 55/80 (69%) cancer cell lines, with highest methylation in leukemia and colon. We also observed methylation in 46/80 (58%) primary tumors, whereas normal tissues showed substantially lower degrees of methylation. RIL expression was lost in 13/16 cancer cell lines and was restored by demethylating agent. Screening of 38 cell lines and 13 primary cancers by SSCP revealed no mutations in RIL, suggesting that methylation and LOH are the primary inactivation mechanisms. Stable transfection of RIL into colorectal cancer cells resulted in reduction in cell growth, clonogenicity, and increased apoptosis upon UVC treatment, suggesting that RIL is a good candidate TSG. ^ In searching for a cause of RIL hypermethylation, we identified a 12-bp polymorphic sequence around the transcription start site of the gene that creates a long allele containing 3CTC repeat. Evolutionary studies suggested that the long allele appeared late in evolution due to insertion. Using bisulfite sequencing, in cancers heterozygous for RIL, we found that the short allele is 4.4-fold more methylated than the long allele (P = 0.003). EMSA results suggested binding of factor(s) to the inserted region of the long allele, but not to the short. EMSA mutagenesis and competition studies, as well as supershifts using nuclear extracts or recombinant Sp1 strongly indicated that those DNA binding proteins are Sp1 and Sp3. Transient transfections of RIL allele-specific expression constructs showed less than 2-fold differences in luciferase activity, suggesting no major effects of the additional Sp1 site on transcription. 
However, stable transfection resulted in 3-fold lower levels of transcription from the short allele 60 days post-transfection, consistent with the concept that the polymorphic Sp1 site protects against time-dependent silencing. Thus, an insertional polymorphism in the RIL promoter creates an additional Sp1/Sp3 site, which appears to protect it from silencing and methylation in cancer.
Abstract:
Hundreds of genes show aberrant DNA hypermethylation in cancer, yet little is known about the causes of this hypermethylation. We identified RIL as a frequent methylation target in cancer. In searching for factors that influence RIL hypermethylation, we found a 12-bp polymorphic sequence around its transcription start site that creates a long allele. Pyrosequencing of homozygous tumors revealed 2.1-fold higher methylation for the short alleles (P<0.001). Bisulfite sequencing of cancers heterozygous for RIL showed that the short alleles are 3.1-fold more methylated than the long (P<0.001). The comparison of expression levels between unmethylated long and short EBV-transformed cell lines showed no difference in expression in vivo. Electrophoretic mobility shift assay showed that the inserted region of the long allele binds Sp1 and Sp3 transcription factors, a binding that is absent in the short allele. Transient transfection of RIL allele-specific transgenes showed no effects of the additional Sp1 site on transcription early on. However, stable transfection of methylation-seeded constructs showed gradually decreasing transcription levels from the short allele with eventual spreading of de novo methylation. In contrast, the long allele showed stable levels of expression over time as measured by luciferase and approximately 2-3-fold lower levels of methylation by bisulfite sequencing (P<0.001), suggesting that the polymorphic Sp1 site protects against time-dependent silencing. Our finding demonstrates that, in some genes, hypermethylation in cancer is dictated by protein-DNA interactions at the promoters and provides a novel mechanism by which genetic polymorphisms can influence an epigenetic state.
Abstract:
OBJECTIVE: We hypothesized that, similar to idiopathic hip osteonecrosis, the T-786C mutation of the endothelial nitric oxide synthase (eNOS) gene affecting nitric oxide (NO) production was associated with neuralgia-inducing cavitational osteonecrosis of the jaws (NICO). DESIGN: In 22 NICO patients who had not taken bisphosphonates, mutations affecting NO production (eNOS T-786C, stromelysin 5A6A) were measured by polymerase chain reaction. Two healthy normal control subjects were matched per case by race and gender. RESULTS: Homozygosity for the mutant eNOS allele (TT) was present in 6 out of 22 patients (27%) with NICO compared with 0 out of 44 (0%) race- and gender-matched control subjects; heterozygosity (TC) was present in 8 patients (36%) versus 15 control subjects (34%); and the wild-type normal genotype (CC) was present in 8 patients (36%) versus 29 controls (66%) (P = .0008). The mutant eNOS T-786C allele was more common in cases (20 out of 44 [45%]) than in control subjects (15 out of 88 [17%]) (P = .0005). The distribution of the stromelysin 5A6A genotype in cases did not differ from control subjects (P = .13). CONCLUSIONS: The eNOS T-786C polymorphism affecting NO production is associated with NICO, may contribute to the pathogenesis of NICO, and may open therapeutic medical approaches to treatment of NICO through provision of L-arginine, the amino-acid precursor of NO.
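The allele totals in the RESULTS above follow directly from the genotype counts, since each person carries two alleles. The sketch below reproduces that arithmetic in Python. The function name is hypothetical, and the wild-type case count is taken as the remainder 22 − 6 − 8 = 8, which is an assumption chosen so that the genotype counts sum to the 22 patients and match the reported 36%.

```python
def allele_summary(hom_mutant, het, hom_wild):
    """Return (mutant allele count, total alleles, mutant allele frequency)."""
    people = hom_mutant + het + hom_wild
    total_alleles = 2 * people             # each person carries two alleles
    mutant_alleles = 2 * hom_mutant + het  # homozygotes carry two, heterozygotes one
    return mutant_alleles, total_alleles, mutant_alleles / total_alleles

# Genotype counts from the abstract. The wild-type case count of 8 is an
# assumption: it is the remainder 22 - 6 - 8 and matches the reported 36%.
cases = allele_summary(hom_mutant=6, het=8, hom_wild=8)
controls = allele_summary(hom_mutant=0, het=15, hom_wild=29)

for label, (mutant, total, freq) in (("cases", cases), ("controls", controls)):
    print(f"{label}: {mutant}/{total} mutant alleles ({freq:.0%})")
```

The counts agree with the reported 20 of 44 alleles in cases and 15 of 88 in controls. The sketch does not attempt to reproduce the reported P values, because the abstract does not state which test was used.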
Abstract:
Evidence for an RNA gain-of-function toxicity has now been provided for an increasing number of human pathologies. Myotonic dystrophies (DM) belong to a class of RNA-dominant diseases that result from RNA repeat expansion toxicity. Specifically, DM of type 1 (DM1) is caused by an expansion of CUG repeats in the 3'UTR of the DMPK protein kinase mRNA, while DM of type 2 (DM2) is linked to an expansion of CCUG repeats in an intron of the ZNF9 transcript (ZNF9 encodes a zinc finger protein). In both pathologies the mutant RNA forms nuclear foci. The mechanisms that underlie the RNA pathogenicity seem to be rather complex and are not yet completely understood. Here, we describe Drosophila models that might help unravel the molecular mechanisms of DM1-associated CUG expansion toxicity. We generated transgenic flies that express inducible repeats of different types (CUG or CAG) and lengths (16, 240, 480 repeats) and then analyzed transgene localization, RNA expression and toxicity as assessed by induced lethality and eye neurodegeneration. The only line that expressed a toxic RNA has a (CTG)(240) insertion. Moreover, our analysis shows that its level of expression cannot account for its toxicity. In this line, (CTG)(240.4), the expansion is inserted in the first intron of CG9650, a zinc finger protein-encoding gene. Interestingly, CG9650 and (CUG)(240.4) expansion RNAs were found in the same nuclear foci. In conclusion, we suggest that the insertion context is the primary determinant for expansion toxicity in Drosophila models. This finding should contribute to the still open debate on the role of the expansions per se in Drosophila and in human pathogenesis of RNA-dominant diseases.
Abstract:
Conditioned stimulus pathway protein 24 (Csp24) is a beta-thymosin-like protein that is homologous to other members of the family of beta-thymosin repeat proteins that contain multiple actin binding domains. Actin co-precipitates with Csp24 and co-localizes with it in the cytosol of type-B photoreceptor cell bodies. Several signal transduction pathways have been shown to regulate the phosphorylation of Csp24 and contribute to cellular plasticity. Here, we report the identification of the adapter protein 14-3-3 in lysates of the Hermissenda circumesophageal nervous system and its interaction with Csp24. Immunoprecipitation experiments using an antibody that is broadly reactive with several isoforms of the 14-3-3 family of proteins showed that Csp24 co-precipitates with 14-3-3 protein, and nervous systems stimulated with 5-HT exhibited a significant increase in co-precipitated Csp24 probed with a phosphospecific antibody as compared with controls. These results indicate that post-translational modifications of Csp24 regulate its interaction with 14-3-3 protein, and suggest that this mechanism may contribute to the control of intrinsic enhanced excitability.
Abstract:
OBJECTIVES: We evaluated ankyrin repeat domain 1 (ANKRD1), the gene encoding cardiac ankyrin repeat protein (CARP), as a novel candidate gene for dilated cardiomyopathy (DCM) through mutation analysis of a cohort of familial or idiopathic DCM patients, based on the hypothesis that inherited dysfunction of mechanical stretch-based signaling is present in a subset of DCM patients. BACKGROUND: CARP, a transcription coinhibitor, is a member of the titin-N2A mechanosensory complex and translocates to the nucleus in response to stretch. It is up-regulated in cardiac failure and hypertrophy and represses expression of sarcomeric proteins. Its overexpression results in contractile dysfunction. METHODS: In all, 208 DCM patients were screened for mutations/variants in the coding region of ANKRD1 using polymerase chain reaction, denaturing high-performance liquid chromatography, and direct deoxyribonucleic acid sequencing. In vitro functional analyses of the mutation were performed using yeast 2-hybrid assays and investigating the effect on stretch-mediated gene expression in myoblastoid cell lines using quantitative real-time reverse transcription-polymerase chain reaction. RESULTS: Three missense heterozygous ANKRD1 mutations (P105S, V107L, and M184I) were identified in 4 DCM patients. The M184I mutation results in loss of CARP binding with Talin 1 and FHL2, and the P105S mutation in loss of Talin 1 binding. Intracellular localization of mutant CARP proteins is not altered. The mutations result in differential stretch-induced gene expression compared with wild-type CARP. CONCLUSIONS: ANKRD1 is a novel DCM gene, with mutations present in 1.9% of DCM patients. The ANKRD1 mutations may cause DCM as a result of disruption of the normal cardiac stretch-based signaling.
Abstract:
Myotonic dystrophy (DM), an autosomal dominant disorder mapping to human chromosome 19q13.3, is the most common neuromuscular disease in human adults. Following the identification of the mutation underlying the DM phenotype, an unstable (CTG)n trinucleotide repeat in the 3′ untranslated region (UTR) of a gene encoding a ser/thr protein kinase named DM protein kinase (DMPK), the study was targeted at two questions: (1) the identification of the disease-causing mechanism(s) of the unstable repeat, and at a more basic level, (2) the identification of the origin and the mechanism(s) involved in repeat instability. The first goal was to identify the pathophysiological mechanisms of the (CTG)n repeat. The normal repeat is transcribed but not translated; therefore, initial studies centered on the effect on RNA transcript levels. The vast majority of individuals affected by DM are heterozygous for the mutant expansion, so that the normal allele interferes with the analysis of the mutant allele. A quantitative allele-specific RT-PCR procedure was developed and applied to a spectrum of patient tissue samples and cell lines. Equal levels of unprocessed pre-mRNA were determined for the wild type (+) and disease (DM) alleles in skeletal muscle and cell lines of heterozygous DM patients, indicating that any nucleosome binding has no effect at the level of transcriptional initiation and transcription of the mutant DMPK locus. In contrast, processed mRNA levels from the DM allele were reduced relative to the + allele as the size of the expansion increased. The unstable repeat, therefore, impairs post-transcriptional processing of DM allele transcripts. 
This phenomenon has profound effects on overall DMPK locus steady-state transcript levels in cells missing a wild type allele and does not appear to be mediated by imprinting, decreased mRNA stability, generation of aberrant splice forms, or absence of polyadenylation of the mutant allele. In Caucasian DM subjects, the unstable repeat is in complete linkage disequilibrium with a single haplotype composed of nine alleles within and flanking DMPK over a physical distance of 30 kb. A detailed haplotype analysis of the DM region was conducted on a Nigerian (Yoruba) DM family, the only indigenous sub-Saharan DM case reported to date. Each affected member of this family had an expanded (CTG)n repeat in one of their DMPK alleles. However, unlike all other DM populations studied thus far, disassociation of the (CTG)n repeat expansion from other alleles of the putative predisposing haplotype was found. Thus, the expanded (CTG)n repeat in this family was the result of an independent mutational event. Consequently, the origin of DM is unlikely to be the result of a single mutational event, and the hypothesis that a single ancestral haplotype predisposes to repeat expansion is not compelling. (Abstract shortened by UMI.)
Abstract:
Objective. Essential hypertension affects 25% of the US adult population and is a leading contributor to morbidity and mortality. Because BP is a multifactorial phenotype that resists simple genetic analysis, intermediate phenotypes within the complex network of BP regulatory systems may be more accessible to genetic dissection. The Renin-Angiotensin System (RAS) is known to influence intermediate and long-term blood pressure regulation through alterations in vascular tone and renal sodium and fluid resorption. This dissertation examines associations between renin (REN), angiotensinogen (AGT), angiotensin-converting enzyme (ACE) and angiotensin II type 1 receptor (AT1) gene variation and interindividual differences in plasma hormone levels, renal hemodynamics, and BP homeostasis.^ Methods. A total of 150 unrelated men and 150 unrelated women, between 20.0 and 49.9 years of age and free of acute or chronic illness except for a history of hypertension (11 men and 7 women, all off medications), were studied after one week on a controlled sodium diet. RAS plasma hormone levels, renal hemodynamics and BP were determined prior to and during angiotensin II (Ang II) infusion. Individuals were genotyped by PCR for a variable number tandem repeat (VNTR) polymorphism in REN, and for the following restriction fragment length polymorphisms (RFLP): AGT M235T, ACE I/D, and AT1 A1166C. Associations between clinical measurements and allelic variation were examined using multiple linear regression statistical models.^ Results. Women homozygous for the AT1 1166C allele demonstrated higher intracellular levels of sodium (p = 0.044). Men homozygous for the AGT T235 allele demonstrated a blunted decrement in renal plasma flow in response to Ang II infusion (p = 0.0002). There were no significant associations between RAS gene variation and interindividual variation in RAS plasma hormone levels or BP.^ Conclusions. 
Rather than identifying new BP controlling genes or alleles, the study paradigm employed in this thesis (i.e., measured genes, controlled environments and interventions) may provide mechanistic insight into how candidate genes affect BP homeostasis.
Abstract:
Coronary heart disease (CHD) is the leading cause of death in the United States. Recently, the renin-angiotensin system (RAS) was found to be associated with atherosclerosis formation, with angiotensin II inducing vascular smooth muscle cell growth and migration, platelet activation and aggregation, and stimulation of plasminogen activator inhibitor-1. Angiotensin II is converted from angiotensin I by angiotensin I-converting enzyme (ACE), and this enzyme is mainly genetically determined. The ACE gene has been assigned to chromosome 17q23, and an insertion/deletion (I/D) polymorphism has been characterized by the presence/absence of a 287 bp fragment in intron 16 of the gene. The two alleles form three genotypes, namely DD, ID and II, and the DD genotype has been linked to higher plasma ACE levels and cell ACE activity. In this study, the association between the ACE I/D polymorphism and carotid artery wall thickness measured by B-mode ultrasound was investigated in a biracial sample, and the association between the gene and incident CHD was investigated in whites, along with whether any gene-CHD association in whites was due to the effect of the gene on atherosclerosis. The study participants are from the prospective Atherosclerosis Risk in Communities (ARIC) Study, including adults aged 45 to 65 years. The present dissertation used a matched case-control design for studying the associations of the ACE gene with carotid artery atherosclerosis and an unmatched case-control design for the association of the gene with CHD. A significant recessive effect of the D allele on carotid artery thickness was found in blacks (OR = 3.06, 95% C.I: 1.11-8.47, DD vs. ID and II) adjusting for age, gender, cigarette smoking, LDL-cholesterol and diabetes. No similar associations were found in whites. 
The ACE I/D polymorphism is significantly associated with coronary heart disease in whites, and when the data were stratified by carotid artery wall thickness, significant associations were observed only in thin-walled subgroups. Assuming a recessive effect of the D allele, the odds ratio was 2.84 (95% C.I:1.17-6.90, DD vs. ID and II), and it was 2.30 (95% C.I:1.22-4.35, DD vs. ID vs. II) assuming a codominant effect of the D allele. No significant associations were observed when comparing thick-walled CHD cases with thin-walled controls. The following conclusions could be drawn: (1) The ACE I/D polymorphism is unlikely to confer an appreciable increase in the risk of carotid atherosclerosis in US whites, but may increase the risk of carotid atherosclerosis in blacks. (2) The ACE I/D polymorphism is a genetic risk factor for incident CHD in US whites, and this effect is separate from the chronic process of atherosclerosis development. Finally, the associations observed here are not causal, since the I/D polymorphism is in an intron, where no ACE proteins are encoded.
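A quick consistency check can be run on the three odds ratios reported in this abstract. Confidence intervals for odds ratios are commonly built symmetrically on the log scale, in which case the point estimate equals the geometric mean of the two bounds. The Python sketch below assumes that construction with z = 1.96, which the abstract does not state, and the function name is hypothetical.

```python
import math

def log_scale_check(lower, upper, z=1.96):
    """Return (geometric midpoint of the CI, implied standard error of log OR).

    Assumes the interval is symmetric on the log scale (an assumption,
    not something the abstract states).
    """
    midpoint = math.sqrt(lower * upper)
    standard_error = (math.log(upper) - math.log(lower)) / (2 * z)
    return midpoint, standard_error

# (reported odds ratio, lower bound, upper bound) as given in the abstract
reported = [(3.06, 1.11, 8.47), (2.84, 1.17, 6.90), (2.30, 1.22, 4.35)]

for odds_ratio, lower, upper in reported:
    midpoint, se = log_scale_check(lower, upper)
    print(f"reported OR {odds_ratio:.2f}, CI midpoint {midpoint:.2f}, SE(log OR) {se:.2f}")
```

Each midpoint lands within rounding distance of the reported odds ratio, so the published estimates and intervals are internally consistent under this assumption.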
Effect of cancer chemotherapy on the frequency of minisatellite repeat number changes in human sperm
Abstract:
The objective of this study was to determine whether cancer chemotherapy induces detectable mutations in DNA of the human germline and whether minisatellite repeat number changes can be used as a sensitive indicator of genetic damage in human sperm caused by mutagens. We compared the mutation frequencies in sperm of the same cancer patients pre- and post-, pre- and during, or during and post-treatment. Small pool polymerase chain reaction (SP-PCR) (DNA equivalent to approximately 100 sperm) and Southern blotting techniques were used to detect mutations and quantify the frequency of repeat number changes at the minisatellite MS205 locus. One pre- and one post-treatment semen sample was obtained from each Hodgkin's disease patient treated with either: (1) a regimen without alkylating agents, Novantrone, Oncovin, Vinblastine, and Prednisone (NOVP), 4 patients; (2) a regimen containing alkylating agents, Cytoxan, Vinblastine, Procarbazine, and Prednisone (CVPP)/Adriamycin, Bleomycin, DTIC, CCNU, and Prednisone (ABDIC), 2 patients; and (3) a regimen containing alkylating agents, Mechlorethamine, Oncovin, Procarbazine, and Prednisone (MOPP), 1 patient. One pre- and one during treatment semen sample from each of two Hodgkin's disease patients treated with Adriamycin, Bleomycin, Vinblastine, and Dacarbazine (ABVD) were obtained. One during and one post-treatment semen sample from a Hodgkin's disease patient treated with NOVP were also obtained. At least 7900 sperm in each sample were screened for the repeat number changes at the MS205 locus by multi-aliquots of SP-PCR. The mutation frequencies of pre- and post-treatment for the four patients treated with NOVP were 0.22 and 0.18%; 0.24 and 0.16%; 0.35 and 0.28%; and 0.19 and 0.18%. With CVPP/ABDIC, they were 0.22 and 0.23%; and 0.94 and 0.98% for the two patients and with MOPP they were 0.79 and 1.14%. 
The mutation frequencies of pre- and during treatment with ABVD were 0.09 and 0.07%; and 0.34 and 0.27% for the two patients. The mutation frequencies of during and post-treatment with NOVP for one patient were 0.31 and 0.25%. A statistically significant increase in mutation frequency was found only in the patient treated with MOPP. Based on the timing of sample collection during or after treatment and the above results, we conclude that there is no effect of NOVP and CVPP/ABDIC regimens on the mutation frequency in spermatogonia. The spermatocytes are not highly sensitive to chemotherapy agents compared to spermatogonia at the minisatellite MS205 locus. MOPP treatment may increase the mutation frequency at the MS205 locus in spermatogonia.
Abstract:
Background and purpose: Breast cancer continues to be a health problem for women, representing 28 percent of all female cancers and remaining one of the leading causes of death for women. Breast cancer incidence rates become substantial before the age of 50. After menopause, breast cancer incidence rates continue to increase with age, creating a long-lasting source of concern (Harris et al., 1992). Mammography, a technique for the detection of breast tumors in their nonpalpable stage when they are most curable, has taken on considerable importance as a public health measure. The lifetime risk of breast cancer is approximately 1 in 9 and occurs over many decades. Recommendations are that screening be periodic in order to detect cancer at early stages. These recommendations are largely not followed. Not only are most women not getting regular mammograms, but this circumstance is particularly the case among older women, where regular mammography has been proven to reduce mortality by approximately 30 percent. The purpose of this project was to increase our understanding of factors that are associated with stage of readiness to obtain subsequent mammograms. A secondary purpose of this research was to suggest further conceptual considerations toward the extension of the Transtheoretical Model (TTM) of behavior change to repeat screening mammography. Methods. A sample (n = 1,222) of women 50 years and older in a large multi-specialty clinic in Houston, Texas was surveyed by mail questionnaire regarding their previous screening experience and stage of readiness to obtain repeat screening. A computerized database, maintained on all women who undergo mammography at the clinic, was used to identify women who were eligible for the project. The major statistical technique employed to select the significant variables and to examine the main and interaction effects of independent variables on dependent variables was polychotomous stepwise logistic regression. 
A prediction model for each stage of readiness definition was estimated. The expected probabilities for stage of readiness were calculated to assess the magnitude and direction of significant predictors. Results. Analysis showed that both ways of defining stage of readiness for obtaining a screening mammogram were associated with specific constructs, including decisional balance and processes of change. Conclusions. The results of the present study demonstrate that the TTM appears to translate to repeat mammography screening. Findings in the current study also support findings of previous studies that suggest that stage of readiness is associated with respondent decisional balance and the processes of change.
Abstract:
DNA sequence variation is currently a major source of data for studying human origins, evolution, and demographic history, and for detecting linkage association of complex diseases. In this dissertation, I investigated DNA variation in worldwide populations from two ∼10 kb autosomal regions on 22q11.2 (noncoding) and 1q24 (introns). A total of 75 variant sites were found among 128 human sequences in the 22q11.2 region, yielding an estimate of 0.088% for nucleotide diversity (π), and a total of 52 variant sites were found among 122 human sequences in the 1q24 region with an estimated π value of 0.057%. The data from these two regions and a 10 kb noncoding region on Xq13.3 all show a strong excess of low-frequency variants in comparison to that expected from an equilibrium population, indicating a relatively recent population expansion. The effective population sizes estimated from the three regions were 11,000, 12,700, and 8,600, respectively, which are close to the commonly used value of 10,000. In each of the two autosomal regions, the age of the most recent common ancestor (MRCA) was estimated to be older than 1 million years among all the sequences and ∼600,000 years among non-African sequences, providing first evidence from autosomal noncoding or intronic regions for a genetic history of humans much more ancient than the emergence of modern humans. The ancient genetic history of humans indicates no severe bottleneck during the evolution of humans in the last half million years; otherwise, much of the ancient genetic history would have been lost during a severe bottleneck. This study strongly suggests that both the “out of Africa” and the multiregional models are too simple for explaining the evolution of modern humans. A compilation of genome-wide data revealed that nucleotide diversity is highest in autosomal regions, intermediate in X-linked regions, and lowest in Y-linked regions. 
The data suggest the existence of background selection or selective sweep on Y-linked loci. In general, the nucleotide diversity in humans is low compared to that in chimpanzee and Drosophila populations.
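The reported excess of low-frequency variants can be illustrated by comparing the stated nucleotide diversity (π) with Watterson's estimator θ_W = S / (a_n · L), where S is the number of variant sites, L the region length, and a_n the sum of 1/i for i from 1 to n − 1. At neutral equilibrium the two estimate the same quantity, so θ_W exceeding π points to an excess of rare variants. The Python sketch below is not the dissertation's own analysis: it assumes each region is exactly 10,000 bp (the abstract says about 10 kb) and that every variant site counts as one segregating site.

```python
def watterson_theta(variant_sites, n_sequences, length_bp):
    """Watterson's estimator of theta per site."""
    a_n = sum(1.0 / i for i in range(1, n_sequences))  # harmonic sum to n - 1
    return variant_sites / (a_n * length_bp)

REGION_LENGTH_BP = 10_000  # assumption: the abstract gives only "about 10 kb"

# name: (variant sites, sequences sampled, reported pi as a fraction)
regions = {
    "22q11.2": (75, 128, 0.00088),
    "1q24": (52, 122, 0.00057),
}

for name, (sites, n_seq, pi) in regions.items():
    theta = watterson_theta(sites, n_seq, REGION_LENGTH_BP)
    print(f"{name}: theta_W = {theta:.3%} per site, reported pi = {pi:.3%}")
```

Under these assumptions θ_W is larger than π in both regions, which is the direction expected after a population expansion.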
Abstract:
Natural selection is one of the major factors in the evolution of all organisms. Detecting the signature of natural selection has been a central theme in evolutionary genetics. With the availability of microsatellite data, it is of interest to study how natural selection can be detected with microsatellites. The overall aim of this research is to detect signatures of natural selection with data on genetic variation at microsatellite loci. The null hypothesis to be tested is the neutral mutation theory of molecular evolution, which states that different alleles at a locus have equivalent effects on fitness. Currently used tests of this hypothesis, based on data on genetic polymorphism in natural populations, presume that mutations at the loci follow the infinite allele/site models (IAM, ISM), in the sense that at most one mutation event is recorded at each site, and each mutation leads to an allele not seen before in the population. Microsatellite loci, which are abundant in the genome, do not obey these mutation models, since new alleles at such loci can be created by either contraction or expansion of tandem repeat sizes of core motifs. Since the current genome map is mainly composed of microsatellite loci and this class of loci is still most commonly studied in the context of human genome diversity, this research explores how the current procedures for testing the neutral mutation hypothesis should be modified to take into account a generalized model of forward-backward stepwise mutations. In addition, recent literature suggests that the past demographic history of populations, the presence of population substructure, and varying rates of mutation across loci all confound the detection of signatures of natural selection. The effects of the stepwise mutation model and other confounding factors on detecting signatures of natural selection are the main results of the research.
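The practical consequence of the mutation model can be seen in the classical equilibrium expectations for heterozygosity. With θ = 4Nμ, the infinite allele model gives H = θ / (1 + θ), while the strict one-step stepwise mutation model gives H = 1 − 1 / √(1 + 2θ). The Python sketch below compares the two. It uses the symmetric one-step model rather than the generalized forward-backward model studied in the dissertation, and the θ values are arbitrary illustrative choices.

```python
import math

def heterozygosity_iam(theta):
    """Expected equilibrium heterozygosity under the infinite allele model."""
    return theta / (1.0 + theta)

def heterozygosity_smm(theta):
    """Expected equilibrium heterozygosity under the one-step stepwise model."""
    return 1.0 - 1.0 / math.sqrt(1.0 + 2.0 * theta)

# Illustrative theta values only; none of these come from the abstract.
for theta in (0.5, 1.0, 5.0, 10.0):
    h_iam = heterozygosity_iam(theta)
    h_smm = heterozygosity_smm(theta)
    print(f"theta = {theta:>4}: IAM {h_iam:.3f}, SMM {h_smm:.3f}")
```

For any positive θ the stepwise model predicts lower heterozygosity, because back-mutation can recreate existing alleles. A neutrality test calibrated under the infinite allele model would therefore be applied against the wrong null expectation at microsatellite loci.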
Abstract:
With hundreds of single nucleotide polymorphisms (SNPs) in a candidate gene and millions of SNPs across the genome, selecting an informative subset of SNPs to maximize the ability to detect genotype-phenotype association is of great interest and importance. In addition, with a large number of SNPs, analytic methods are needed that allow investigators to control the false positive rate resulting from large numbers of SNP genotype-phenotype analyses. This dissertation uses simulated data to explore methods for selecting SNPs for genotype-phenotype association studies. I examined the pattern of linkage disequilibrium (LD) across a candidate gene region and used this pattern to aid in localizing a disease-influencing mutation. The results indicate that the r² measure of linkage disequilibrium is preferred over the common D′ measure for use in genotype-phenotype association studies. Using step-wise linear regression, the best predictor of the quantitative trait was not usually the single functional mutation. Rather it was a SNP that was in high linkage disequilibrium with the functional mutation. Next, I compared three strategies for selecting SNPs for application to phenotype association studies: based on measures of linkage disequilibrium, based on a measure of haplotype diversity, and random selection. The results demonstrate that SNPs selected based on maximum haplotype diversity are more informative and yield higher power than randomly selected SNPs or SNPs selected based on low pair-wise LD. The data also indicate that for genes with small contribution to the phenotype, it is more prudent for investigators to increase their sample size than to continuously increase the number of SNPs in order to improve statistical power. When typing large numbers of SNPs, researchers are faced with the challenge of utilizing an appropriate statistical method that controls the type I error rate while maintaining adequate power. 
We show that an empirical genotype-based multi-locus global test that uses permutation testing to investigate the null distribution of the maximum test statistic maintains a desired overall type I error rate while not overly sacrificing statistical power. The results also show that when the penetrance model is simple, the multi-locus global test performs as well as or better than the haplotype analysis. However, for more complex models, haplotype analyses offer advantages. The results of this dissertation will be of utility to human geneticists designing large-scale multi-locus genotype-phenotype association studies.
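The two linkage disequilibrium measures compared in this abstract can diverge sharply for the same pair of SNPs. The Python sketch below uses the standard two-locus definitions: D = p_AB − p_A·p_B, D′ = |D| / D_max, and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)). The function name and the allele and haplotype frequencies are hypothetical illustrations, not data from the dissertation.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D-prime, r-squared) for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A and allele B;
    p_a and p_b are the frequencies of alleles A and B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared

# Hypothetical example: a rare allele B (10%) that always occurs on
# haplotypes carrying a common allele A (50%).
d, d_prime, r_squared = ld_measures(p_ab=0.1, p_a=0.5, p_b=0.1)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r_squared:.3f}")
```

In this example D′ reaches its maximum because one haplotype is absent, yet r² stays low because the allele frequencies differ. Since r² tracks how well one SNP predicts another, this kind of gap is one plausible reason for favoring it in association studies.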