36 results for Genome-Wide Association
Abstract:
Prostate cancer (PC) is a significant economic and health burden in the U.S. and Europe, but its causes are largely unknown. The most significant risk factors (after gender) are age and family history of the disease. A gene with high penetrance but low frequency on chromosome 1q, HPC1, has been suggested to cause a proportion of the familial aggregation of PC, but other more common genes, conferring less risk, are also thought to contribute to disease predisposition. We have pursued a strategy to study both types of genetic risk in PC. To identify high penetrance genes, affected men from thirteen families have been genotyped for genetic linkage analysis at six microsatellite markers spanning 45 cM of 1q24-25. Both LOD score and non-parametric statistics provide no significant support for HPC1 in this genomic region, although 3 of the families did combine to produce a LOD score of 0.9. These families will be included in a genome-wide search for other PC predisposition genes as part of a multinational collaboration.
For study of common genetic factors in PC development, leukocyte DNA samples from an unselected series of 55 patients and 67 controls have been examined for genetic differences in two other candidate genes, the androgen receptor gene, hAR, at Xq11-12, and the vitamin D receptor gene, hVDR, at 12q12-14. hAR was typed for two trinucleotide repeat length polymorphisms, (CAG)n and (GGC)n, encoding polyglutamine and polyglycine tracts, respectively, which have been implicated in PC susceptibility. These data, combined with similarly processed patients and controls from the U.K., show no consistent association of allele length with PC risk. A novel finding, however, has been a significant association between the number of GGC repeats and the length of time between diagnosis and relapse in stage T1-T4 Caucasian patients irrespective of therapy and age of the patient. Of 49 patients who relapsed out of 108 entering the study, those with 16 or fewer GGC repeats had an average relapse-free period of 101 (±7.7) months, while for those with more than 16 repeats the period averaged 48 (±2.9) months, a difference of 2.1 fold or 4.4 years.
The second gene, hVDR, was genotyped at two polymorphisms, a synonymous C/T substitution in exon 9 identified by differential TaqI enzymatic digestion and a variable length polyA tract in the 3′ UTR. Although these polymorphisms are in strong linkage disequilibrium, only the polyA region showed a possible association with PC risk. Men homozygous for alleles with fewer than 18 A's had an increased risk (OR = 3.0, p = 0.0578) compared to controls. This result is opposite to the findings of others and may either indicate off-setting random errors which together balance out to no significant overall effect or reflect more complex genetic and/or environmental associations.
Overall, this research suggests that single gene familial predisposition may be less prominent in PC than in other cancers and that the characteristics of PC pathology may be useful in identifying the effects of common genetic factors.
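The relapse-free-period comparison quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the two group means given in the text (101 and 48 months); the variable names are illustrative and not taken from the dissertation.

```python
# Group means quoted in the abstract, in months.
mean_16_or_fewer_repeats = 101.0  # patients with 16 or fewer GGC repeats
mean_more_than_16_repeats = 48.0  # patients with more than 16 GGC repeats

# Ratio of the two means (the "fold" difference).
fold_difference = mean_16_or_fewer_repeats / mean_more_than_16_repeats

# Absolute difference converted from months to years.
difference_years = (mean_16_or_fewer_repeats - mean_more_than_16_repeats) / 12.0

print(f"fold difference: {fold_difference:.1f}")       # prints 2.1
print(f"difference in years: {difference_years:.1f}")  # prints 4.4
```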
Abstract:
DNA sequence variation is currently a major source of data for studying human origins, evolution, and demographic history, and for detecting linkage association of complex diseases. In this dissertation, I investigated DNA variation in worldwide populations from two ∼10 kb autosomal regions on 22q11.2 (noncoding) and 1q24 (introns). A total of 75 variant sites were found among 128 human sequences in the 22q11.2 region, yielding an estimate of 0.088% for nucleotide diversity (π), and a total of 52 variant sites were found among 122 human sequences in the 1q24 region with an estimated π value of 0.057%. The data from these two regions and a 10 kb noncoding region on Xq13.3 all show a strong excess of low-frequency variants in comparison to that expected from an equilibrium population, indicating a relatively recent population expansion. The effective population sizes estimated from the three regions were 11,000, 12,700, and 8,600, respectively, which are close to the commonly used value of 10,000. In each of the two autosomal regions, the age of the most recent common ancestor (MRCA) was estimated to be older than 1 million years among all the sequences and ∼600,000 years among non-African sequences, providing the first evidence from autosomal noncoding or intronic regions for a genetic history of humans much more ancient than the emergence of modern humans. This ancient genetic history indicates that there was no severe bottleneck during the evolution of humans in the last half million years; otherwise, much of it would have been lost. This study strongly suggests that both the “out of Africa” and the multiregional models are too simple for explaining the evolution of modern humans. A compilation of genome-wide data revealed that nucleotide diversity is highest in autosomal regions, intermediate in X-linked regions, and lowest in Y-linked regions. The data suggest the existence of background selection or selective sweeps on Y-linked loci. In general, the nucleotide diversity in humans is low compared to that in chimpanzee and Drosophila populations.
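The "excess of low-frequency variants" reported above can be illustrated by comparing π with Watterson's estimator, which is based only on the number of variant sites. The following is a minimal sketch, not the dissertation's own analysis: it assumes each region is exactly 10,000 bp (the abstract says only "∼10 kb") and treats every variant site as a segregating site. When Watterson's estimate exceeds π, rare variants are over-represented relative to an equilibrium population.

```python
def harmonic_number(k: int) -> float:
    """Return 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta_per_site(segregating_sites: int, n_sequences: int,
                             region_length_bp: int) -> float:
    """Watterson's estimator of theta per site: S / a_n / L."""
    a_n = harmonic_number(n_sequences - 1)
    return segregating_sites / a_n / region_length_bp


# ASSUMPTION: exact region length; the abstract gives only "~10 kb".
REGION_LENGTH_BP = 10_000

# Variant-site counts, sample sizes, and pi as quoted in the abstract.
regions = {
    "22q11.2": {"S": 75, "n": 128, "pi": 0.00088},
    "1q24": {"S": 52, "n": 122, "pi": 0.00057},
}

theta_w = {}
for name, r in regions.items():
    theta_w[name] = watterson_theta_per_site(r["S"], r["n"], REGION_LENGTH_BP)
    print(f"{name}: theta_W = {theta_w[name]:.5f}, pi = {r['pi']:.5f}, "
          f"theta_W > pi: {theta_w[name] > r['pi']}")
```

Under these assumptions both regions give a Watterson estimate larger than π, which is the direction expected after a population expansion.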
Abstract:
Linkage disequilibrium (LD) is defined as the nonrandom association of alleles at two or more loci in a population and may be a useful tool in a diverse array of applications including disease gene mapping, elucidating the demographic history of populations, and testing hypotheses of human evolution. However, the successful application of LD-based approaches to pertinent genetic questions is hampered by a lack of understanding about the forces that mediate the genome-wide distribution of LD within and between human populations. Delineating the genomic patterns of LD is a complex task that will require interdisciplinary research that transcends traditional scientific boundaries. The research presented in this dissertation is predicated upon the need for interdisciplinary studies, and both theoretical and experimental projects were pursued. In the theoretical studies, I have investigated the effect of genotyping errors and SNP identification strategies on estimates of LD. These two chapters provide insights and guidance for the design of future empirical LD studies. Furthermore, I analyzed the allele frequency distribution of 26,530 single nucleotide polymorphisms (SNPs) in three populations and generated the first-generation natural selection map of the human genome, which will be an important resource for explaining and understanding genomic patterns of LD. Finally, in the experimental study, I describe a novel, simple, low-cost, and high-throughput SNP genotyping method. The theoretical analyses and experimental tools developed in this dissertation will facilitate a more complete understanding of patterns of LD in human populations.
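The definition of LD given above can be made concrete with the standard pairwise measures D, D′, and r² for two biallelic loci. The sketch below is illustrative only: the function name and the example frequencies are hypothetical, and the abstract does not say which LD measures the dissertation used.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # D is the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b

    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# Hypothetical frequencies, chosen only to exercise the function.
d, d_prime, r_squared = ld_measures(p_ab=0.4, p_a=0.5, p_b=0.5)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r_squared:.3f}")
```

With random association (p_ab equal to p_a × p_b), all three measures are zero.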
Abstract:
Linkage and association studies are major analytical tools to search for susceptibility genes for complex diseases. With the availability of large collections of single nucleotide polymorphisms (SNPs) and rapid progress in high-throughput genotyping technologies, together with the ambitious goals of the International HapMap Project, genetic markers covering the whole genome will be available for genome-wide linkage and association studies. In order not to inflate the type I error rate in genome-wide linkage and association studies, the significance level of each individual linkage and/or association test must be adjusted for multiple testing, and this has led to suggested genome-wide significance cut-offs as low as 5 × 10⁻⁷. Almost no linkage and/or association study can meet such a stringent threshold with the standard statistical methods. New statistics with high power are urgently needed to tackle this problem. This dissertation proposes and explores a class of novel test statistics that can be used with both population-based and family-based genetic data by employing a completely new strategy, which uses nonlinear transformations of the sample means to construct test statistics for linkage and association studies. Extensive simulation studies are used to illustrate the properties of the nonlinear test statistics. Power calculations are performed using both analytical and empirical methods. Finally, real data sets are analyzed with the nonlinear test statistics. Results show that the nonlinear test statistics have correct type I error rates, and most of the studied nonlinear test statistics have higher power than the standard chi-square test. This dissertation introduces a new idea for designing novel test statistics with high power and might open new ways of mapping susceptibility genes for complex diseases.
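To show why a cut-off near 5 × 10⁻⁷ is so hard to reach, the sketch below pairs a Bonferroni-adjusted threshold with the standard 1-df chi-square test on a 2×2 allele-count table. It does not implement the nonlinear statistics proposed in the dissertation, which the abstract does not specify. The number of tests (100,000) and the allele counts are hypothetical values chosen for illustration.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def allelic_chi_square(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 allele table."""
    n = case_a + case_b + ctrl_a + ctrl_b
    numerator = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    denominator = ((case_a + case_b) * (ctrl_a + ctrl_b)
                   * (case_a + ctrl_a) * (case_b + ctrl_b))
    chi_square = numerator / denominator
    # Upper-tail probability of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# HYPOTHETICAL: 100,000 independent tests at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 100_000)

# HYPOTHETICAL allele counts: cases 60/40, controls 50/50.
chi_square, p_value = allelic_chi_square(60, 40, 50, 50)

print(f"per-test threshold: {threshold:.1e}")
print(f"chi-square = {chi_square:.3f}, p = {p_value:.3f}, "
      f"genome-wide significant: {p_value < threshold}")
```

In this toy example a ten-point difference in allele frequency with 100 alleles per group falls far short of the adjusted threshold, which is the power problem the abstract describes.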
Abstract:
Hypertension (HT) is mediated by the interaction of many genetic and environmental factors. Previous genome-wide linkage analysis studies have found many loci that show linkage to HT or blood pressure (BP) regulation, but the results were generally inconsistent. Gene by environment interaction is among the reasons that potentially explain these inconsistencies between studies. Here we investigate influences of gene by smoking (GxS) interaction on HT and BP in European American (EA), African American (AA) and Mexican American (MA) families from the GENOA study. A variance component-based method was utilized to perform genome-wide linkage analysis of systolic blood pressure (SBP), diastolic blood pressure (DBP), and HT status, as well as bivariate analysis for SBP and DBP for smokers, non-smokers, and combined groups. The most significant results were found for SBP in MA. The strongest signal was on chromosome 17q24 (LOD = 4.2), which increased to LOD = 4.7 in bivariate analysis, but there was no evidence of GxS interaction at this locus (p = 0.48). Two signals were identified only in one group: on chromosome 15q26.2 (LOD = 3.37) in non-smokers and chromosome 7q21.11 (LOD = 1.4) in smokers, both of which had strong evidence for GxS interaction (p = 0.00039 and 0.009, respectively). There were also two other signals, one on chromosome 20q12 (LOD = 2.45) in smokers, which became much higher in the combined sample (LOD = 3.53), and one on chromosome 6p22.2 (LOD = 2.06) in non-smokers. Neither peak had very strong evidence for GxS interaction (p = 0.08 and 0.06, respectively). A fine mapping association study was performed using 200 SNPs in 30 genes located under the linkage signals on chromosomes 15 and 17. Under the chromosome 15 peak, the association analysis identified 6 SNPs accounting for a 7 mmHg increase in SBP in MA non-smokers. For the chromosome 17 linkage peak, the association analysis identified 3 SNPs accounting for a 6 mmHg increase in SBP in MA. However, none of these SNPs was significant after correcting for multiple testing, and accounting for them in the linkage analysis produced very small reductions in the linkage signal.
The linkage analysis of BP traits stratified by smoking status produced notable signals for SBP in the MA population. The fine mapping association analysis gave some insight into the contribution of some SNPs to two of the identified signals, but since these SNPs did not remain significant after multiple testing correction and did not explain the linkage peaks, more work is needed to confirm these exploratory results and identify the causal variants under these linkage peaks.
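The LOD scores quoted above can be put on a more familiar scale by converting them to approximate pointwise p-values. The sketch below assumes the common approximation for a univariate variance-component linkage test, in which the likelihood-ratio statistic 2·ln(10)·LOD follows a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom. The abstract does not state how significance was assessed, so this is an assumption, and the bivariate LOD of 4.7 is left out because a different null distribution would apply.

```python
import math


def lod_to_chi_square(lod: float) -> float:
    """Convert a LOD score (log10 likelihood ratio) to a likelihood-ratio chi-square."""
    return 2.0 * math.log(10.0) * lod


def pointwise_p_value(lod: float) -> float:
    """Approximate pointwise p-value.

    ASSUMPTION: the null distribution is a 50:50 mixture of 0 and chi-square(1 df),
    so the tail probability of chi-square(1 df) is halved.
    """
    chi_square = lod_to_chi_square(lod)
    # Upper tail of chi-square with 1 df: erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi_square / 2.0))


# Univariate LOD scores quoted in the abstract.
univariate_lods = {
    "17q24": 4.2,
    "15q26.2 (non-smokers)": 3.37,
    "7q21.11 (smokers)": 1.4,
    "20q12 (smokers)": 2.45,
    "20q12 (combined)": 3.53,
    "6p22.2 (non-smokers)": 2.06,
}

p_values = {locus: pointwise_p_value(lod) for locus, lod in univariate_lods.items()}
for locus, p in p_values.items():
    print(f"{locus}: LOD = {univariate_lods[locus]}, approximate p = {p:.2e}")
```

These are pointwise values at a single locus; they are not adjusted for the genome-wide search.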
Abstract:
Clubfoot is a common, complex birth defect affecting 4,000 newborns in the United States and 135,000 worldwide each year. The clubfoot deformity is characterized by inward and rigid downward displacement of one or both feet, along with persistent calf muscle hypoplasia. Despite strong evidence for a genetic liability, there is a limited understanding of the genetic and environmental factors contributing to the etiology of clubfoot. The studies described in this dissertation were performed to identify variants and/or genes associated with clubfoot. A genome-wide linkage scan performed on ten multiplex clubfoot families identified seven new chromosomal regions that provide new areas to search for clubfoot genes. Troponin C (TNNC2), the strongest candidate gene, located in 20q12-q13.11, is involved in muscle contraction. Exon sequencing of TNNC2 did not identify any novel coding variants. Interrogation of fifteen muscle contraction genes found strong associations with SNPs located in potential regulatory regions of TPM1 (rs4075583 and rs3805965), TPM2 (rs2025126 and rs2145925) and TNNC2 (rs383112 and rs437122). In previous studies, a strong association was found with rs3801776, located in the basal promoter of HOXA9, a gene also involved in muscle development and patterning. Altogether, these data suggest that SNPs located in potential regulatory regions of genes involved in muscle development and function could alter transcription factor binding, leading to changes in gene expression. Functional analysis of rs3801776/HOXA9, rs2025126/TPM2 and rs2145925/TPM2 showed altered protein binding, which significantly influenced promoter activity. Although the ancestral allele (G) of rs4075583/TPM1 creates a DNA-protein complex, it did not affect TPM1 promoter activity. Importantly, however, in the context of a haplotype, rs4075583/G significantly decreased TPM1 promoter activity. These results suggest that dysregulation of multiple skeletal muscle genes, TPM1, TPM2, TNNC2 and HOXA9, working in concert may contribute to clubfoot. However, specific allelic combinations involving these four regulatory SNPs did not confer a significantly higher risk for clubfoot. Other combinations of these variants are being evaluated. Moreover, these variants may interact with yet to be discovered variants in other genes to confer a higher clubfoot risk. Collectively, we show novel evidence for the role of skeletal muscle genes in clubfoot, indicating that there are multiple genetic factors contributing to this complex birth defect.