11 results for Populations genetic
in DigitalCommons@The Texas Medical Center
Abstract:
Recent studies indicate that polymorphic genetic markers are potentially helpful in resolving genealogical relationships among individuals in a natural population. Genetic data provide opportunities for paternity exclusion when genotypic incompatibilities are observed among individuals, and the present investigation examines the resolving power of genetic markers in unambiguous positive determination of paternity. Under the assumption that the mother of each offspring in a population is unambiguously known, an analytical expression for the fraction of males excluded from paternity is derived for the case where males and females may be derived from two different gene pools. This theoretical formulation can also be used to predict the fraction of births for which all but one male can be excluded from paternity. We show that even when the average probability of exclusion approaches unity, a substantial fraction of births yield equivocal mother-father-offspring determinations. The number of loci needed to increase the frequency of unambiguous determinations to a high level is beyond the scope of current electrophoretic studies in most species. Application of this theory to electrophoretic data on Chamaelirium luteum (L.) shows that in 2255 offspring derived from 273 males and 70 females, only 57 triplets could be unequivocally determined with eight polymorphic protein loci, even though the average combined exclusionary power of these loci was 73%. The distribution of potentially compatible male parents, based on multilocus genotypes, was reasonably well predicted from the allele frequency data available for these loci. We demonstrate that genetic paternity analysis in natural populations cannot be reliably based on exclusionary principles alone. In order to measure the reproductive contributions of individuals in natural populations, more elaborate likelihood principles must be deployed.
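The gap between a high average exclusion probability and a small number of unambiguous assignments can be illustrated with a simplified calculation. The sketch below assumes, unlike the genotype-specific formulation described in the abstract, that every non-father is excluded independently and with the same probability; the function names are illustrative only.

```python
import math

def combined_exclusion(per_locus):
    """Combined exclusion probability over independent loci: 1 - prod(1 - P_i)."""
    return 1.0 - math.prod(1.0 - p for p in per_locus)

def prob_all_nonfathers_excluded(p_excl, n_nonfathers):
    """Chance that every non-father is excluded (uniform, independent p_excl assumed)."""
    return p_excl ** n_nonfathers

def exclusion_needed(target, n_nonfathers):
    """Uniform exclusion probability needed for a fraction `target` of unambiguous births."""
    return target ** (1.0 / n_nonfathers)

# 273 candidate males (figure from the abstract) means 272 non-fathers per offspring.
print(prob_all_nonfathers_excluded(0.73, 272))
print(exclusion_needed(0.5, 272))
```

Under this uniform assumption, the exclusion probability would have to exceed roughly 0.997 before half of the births became unambiguous. Real exclusion probabilities vary with the mother-offspring genotype, which this sketch ignores.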
Abstract:
The interpretation of data on genetic variation with regard to the relative roles of different evolutionary factors that produce and maintain genetic variation depends critically on our assumptions concerning effective population size and the level of migration between neighboring populations. In humans, recent population growth and movements of specific ethnic groups across wide geographic areas mean that any theory based on assumptions of constant population size and absence of substructure is generally untenable. We examine the effects of population subdivision on the pattern of protein genetic variation in a total sample drawn from an artificial agglomerate of 12 tribal populations of Central and South America, analyzing the pooled sample as though it were a single population. Several striking findings emerge. (1) Mean heterozygosity is not sensitive to agglomeration, but the number of different alleles (allele count) is inflated, relative to neutral mutation/drift/equilibrium expectation. (2) The inflation is most serious for rare alleles, especially those which originally occurred as tribally restricted "private" polymorphisms. (3) The degree of inflation is an increasing function of both the number of populations encompassed by the sample and of the genetic divergence among them. (4) Treating an agglomerated population as though it were a panmictic unit of long standing can lead to serious biases in estimates of mutation rates, selection pressures, and effective population sizes. Current DNA studies indicate the presence of numerous genetic variants in human populations. The findings and conclusions of this paper are all fully applicable to the study of genetic variation at the DNA level as well.
Abstract:
Extremes of electrocardiographic QT interval are associated with increased risk for sudden cardiac death (SCD); thus, identification and characterization of genetic variants that modulate QT interval may elucidate the underlying etiology of SCD. Previous studies have revealed an association between a common genetic variant in NOS1AP and QT interval in populations of European ancestry, but this finding has not been extended to other ethnic populations. We sought to characterize the effects of NOS1AP genetic variants on QT interval in the multi-ethnic population-based Dallas Heart Study (DHS, n = 3,072). The SNP most strongly associated with QT interval in previous samples of European ancestry, rs16847548, was the most strongly associated in White (P = 0.005) and Black (P = 3.6 x 10(-5)) participants, with the same direction of effect in Hispanics (P = 0.17), and further showed a significant SNP x sex-interaction (P = 0.03). A second SNP, rs16856785, uncorrelated with rs16847548, was also associated with QT interval in Blacks (P = 0.01), with qualitatively similar results in Whites and Hispanics. In a previously genotyped cohort of 14,107 White individuals drawn from the combined Atherosclerotic Risk in Communities (ARIC) and Cardiovascular Health Study (CHS) cohorts, we validated both the second locus at rs16856785 (P = 7.63 x 10(-8)), as well as the sex-interaction with rs16847548 (P = 8.68 x 10(-6)). These data extend the association of genetic variants in NOS1AP with QT interval to a Black population, with similar trends, though not statistically significant at P<0.05, in Hispanics. In addition, we identify a strong sex-interaction and the presence of a second independent site within NOS1AP associated with the QT interval. These results highlight the consistent and complex role of NOS1AP genetic variants in modulating QT interval.
Abstract:
Recent attempts to detect mutations involving single base changes or small deletions that are specific to genetic diseases provide an opportunity to develop a two-tier mutation-screening program through which the incidence of rare genetic disorders and gene carriers may be precisely estimated. A two-tier survey consists of mutation screening in a sample of patients with specific genetic disorders and in a second sample of newborns from the same population in which mutation frequency is evaluated. We provide the statistical basis for evaluating the incidence of affected individuals and gene carriers in such two-tier mutation-screening surveys, from which the precision of the estimates is derived. Sample-size requirements of such two-tier mutation-screening surveys are evaluated. Considering examples of cystic fibrosis (CF) and medium-chain acyl-CoA dehydrogenase deficiency (MCAD), the two most frequent autosomal recessive diseases in Caucasian populations, and the two most frequent mutations (delta F508 and G985) that occur on these disease allele-bearing chromosomes, we show that, with 50-100 patients and a 20-fold larger sample of newborns screened for these mutations, the incidence of such diseases and their gene carriers in a population may be quite reliably estimated. The theory developed here is also applicable to rare autosomal dominant diseases for which disease-specific mutations are found.
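One way to read the two-tier design for an autosomal recessive disease is sketched below. It assumes Hardy-Weinberg proportions, that the newborn sample estimates the population frequency of the screened mutation, and that the patient sample estimates the share of disease chromosomes carrying it. The input values are hypothetical rather than taken from the study, and the estimators and variances actually derived in the paper may differ.

```python
def two_tier_estimates(mutation_freq_newborns, mutation_share_in_patients):
    """Return (disease allele frequency, incidence, carrier frequency).

    mutation_freq_newborns: frequency of the screened mutation among newborn chromosomes.
    mutation_share_in_patients: fraction of patients' disease chromosomes carrying it.
    """
    q = mutation_freq_newborns / mutation_share_in_patients  # all disease alleles
    incidence = q ** 2          # affected homozygotes under Hardy-Weinberg
    carriers = 2 * q * (1 - q)  # heterozygous carriers
    return q, incidence, carriers

# Hypothetical inputs, for illustration only.
q, incidence, carriers = two_tier_estimates(0.014, 0.70)
print(f"q={q:.4f} incidence={incidence:.6f} carriers={carriers:.4f}")
```

Dividing by the patient-derived share scales the frequency of one detectable mutation up to the frequency of all disease alleles, which is why both samples are needed.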
Abstract:
DNA sequence variation is currently a major source of data for studying human origins, evolution, and demographic history, and for detecting linkage association of complex diseases. In this dissertation, I investigated DNA variation in worldwide populations from two ∼10 kb autosomal regions on 22q11.2 (noncoding) and 1q24 (introns). A total of 75 variant sites were found among 128 human sequences in the 22q11.2 region, yielding an estimate of 0.088% for nucleotide diversity (π), and a total of 52 variant sites were found among 122 human sequences in the 1q24 region with an estimated π value of 0.057%. The data from these two regions and a 10 kb noncoding region on Xq13.3 all show a strong excess of low-frequency variants in comparison to that expected from an equilibrium population, indicating a relatively recent population expansion. The effective population sizes estimated from the three regions were 11,000, 12,700, and 8,600, respectively, which are close to the commonly used value of 10,000. In each of the two autosomal regions, the age of the most recent common ancestor (MRCA) was estimated to be older than 1 million years among all the sequences and ∼600,000 years among non-African sequences, providing first evidence from autosomal noncoding or intronic regions for a genetic history of humans much more ancient than the emergence of modern humans. The ancient genetic history of humans indicates no severe bottleneck during the evolution of humans in the last half million years; otherwise, much of the ancient genetic history would have been lost during a severe bottleneck. This study strongly suggests that both the “out of Africa” and the multiregional models are too simple for explaining the evolution of modern humans. A compilation of genome-wide data revealed that nucleotide diversity is highest in autosomal regions, intermediate in X-linked regions, and lowest in Y-linked regions. 
The data suggest the existence of background selection or selective sweep on Y-linked loci. In general, the nucleotide diversity in humans is low compared to that in chimpanzee and Drosophila populations.
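The excess of low-frequency variants can be made concrete by comparing the reported π with Watterson's estimator θ_W, which is based on the number of variant sites. The sketch below uses the counts given in the abstract and assumes each region is exactly 10,000 bp (the abstract says only "∼10 kb"), so the resulting values are approximate.

```python
def watterson_theta(segregating_sites, n_sequences, length_bp):
    """Watterson's estimator per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * length_bp)

LENGTH_BP = 10_000  # assumption: "~10 kb" taken as exactly 10,000 bp

# S, n and pi are the values reported in the abstract (pi converted from percent).
regions = {
    "22q11.2": {"S": 75, "n": 128, "pi": 0.00088},
    "1q24": {"S": 52, "n": 122, "pi": 0.00057},
}
for name, r in regions.items():
    theta = watterson_theta(r["S"], r["n"], LENGTH_BP)
    print(f"{name}: theta_W={theta:.5f} pi={r['pi']:.5f}")
```

When π falls below θ_W, as it does for both regions under this length assumption, the sample contains more rare variants than an equilibrium population would, which is the pattern the abstract attributes to a recent expansion.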
Abstract:
Linkage disequilibrium (LD) is defined as the nonrandom association of alleles at two or more loci in a population and may be a useful tool in a diverse array of applications including disease gene mapping, elucidating the demographic history of populations, and testing hypotheses of human evolution. However, the successful application of LD-based approaches to pertinent genetic questions is hampered by a lack of understanding about the forces that mediate the genome-wide distribution of LD within and between human populations. Delineating the genomic patterns of LD is a complex task that will require interdisciplinary research that transcends traditional scientific boundaries. The research presented in this dissertation is predicated upon the need for interdisciplinary studies, and both theoretical and experimental projects were pursued. In the theoretical studies, I investigated the effect of genotyping errors and SNP identification strategies on estimates of LD. These two chapters provide insights and guidance for the design of future empirical LD studies. Furthermore, I analyzed the allele frequency distribution of 26,530 single nucleotide polymorphisms (SNPs) in three populations and generated the first-generation natural selection map of the human genome, which will be an important resource for explaining and understanding genomic patterns of LD. Finally, in the experimental study, I describe a novel, simple, low-cost, and high-throughput SNP genotyping method. The theoretical analyses and experimental tools developed in this dissertation will facilitate a more complete understanding of patterns of LD in human populations.
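The definition of LD as a nonrandom association of alleles can be illustrated with the two standard pairwise measures, D and r². This is a minimal sketch for two biallelic loci with hypothetical frequencies; it is not the estimation procedure used in the dissertation, and the function name is illustrative.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b  # departure from random association
    r_squared = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared

print(ld_statistics(0.50, 0.5, 0.5))  # alleles always co-occur
print(ld_statistics(0.25, 0.5, 0.5))  # alleles associate at random
```

D is zero when the haplotype frequency equals the product of the allele frequencies; r² rescales D so that values are comparable across loci with different allele frequencies.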
Abstract:
Apolipoprotein E (ApoE) plays a major role in the metabolism of high density and low density lipoproteins (HDL and LDL). Its common protein isoforms (E2, E3, E4) are risk factors for coronary artery disease (CAD) and explain between 16 and 23% of the inter-individual variation in plasma apoE levels. Linkage analysis has been completed for plasma apoE levels in the GENOA study (Genetic Epidemiology Network of Atherosclerosis). After stratification of the population by lipoprotein levels and body mass index (BMI) to create more homogeneity with regard to biological context for apoE levels, Hispanic families showed significant linkage on chromosome 17q for two strata (LOD=2.93 at 104 cM for a low cholesterol group, LOD=3.04 at 111 cM for a low cholesterol, high HDLC group). Replication of 17q linkage was observed for apoB and apoE levels in the unstratified Hispanic and African-American populations, and for apoE levels in African-American families. Replication of this 17q linkage in different populations and strata provides strong support for the presence of gene(s) in this region with significant roles in the determination of inter-individual variation in plasma apoE levels. Through a positional and functional candidate gene approach, ten genes were identified in the 17q linked region, and 62 polymorphisms in these genes were genotyped in the GENOA families. Association analysis was performed with FBAT, GEE, and variance-component based tests followed by conditional linkage analysis. Association studies with partial coverage of TagSNPs in the gene coding for apolipoprotein H (APOH) were performed, and significant results were found for 2 SNPs (APOH_20951 and APOH_05407) in the Hispanic low cholesterol stratum, accounting for 3.49% of the inter-individual variation in plasma apoE levels. 
Among the other candidate genes, we identified a haplotype block in the ACE1 gene that contains two major haplotypes associated with apoE levels as well as total cholesterol, apoB and LDLC levels in the unstratified Hispanic population. Identifying genes responsible for the remaining 60% of inter-individual variation in plasma apoE levels will yield new insights into the genetic interactions involved in lipid metabolism, and a more precise understanding of the risk factors leading to CAD.
Abstract:
Orosomucoid (ORM) or alpha-1 acid glycoprotein is an acute phase protein of human plasma whose function is suggested to be the competitive inhibition of cellular recognition by infective agents. Isoelectric focusing (IEF) and immunoblotting have been combined and optimum conditions have been determined for reliable classification of different ORM phenotypes. Addition of 6 M urea to an IEF gel revealed additional microheterogeneity in the ORM system which has not been previously reported. A total of 1,667 individuals from different native ethnic groups of North and South America, Africa and New Guinea have been screened to determine the distribution of ORM alleles. Two common alleles, ORM1*1 and ORM1*2, have been observed and their frequencies were determined. Genetically independent variation consistent with expression of the ORM2 locus was observed in American and African blacks but was not observed in other sampled populations. The population allele frequencies for this new locus were 0.958, 0.025, 0.006, and 0.011 for alleles ORM2*1, ORM2*2, ORM2*3, and ORM2*4, respectively. Family studies confirm the autosomal codominant inheritance of the phenotypes observed at both ORM loci.
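As a quick consistency check on the reported ORM2 allele frequencies, the sketch below confirms that they sum to one and derives the heterozygosity expected under Hardy-Weinberg proportions. The heterozygosity value is a derived illustration, not a figure reported in the abstract.

```python
# Allele frequencies for the ORM2 locus as given in the abstract.
orm2_freqs = {"ORM2*1": 0.958, "ORM2*2": 0.025, "ORM2*3": 0.006, "ORM2*4": 0.011}

def expected_heterozygosity(freqs):
    """Expected heterozygosity under Hardy-Weinberg: 1 - sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs)

total = sum(orm2_freqs.values())
h_exp = expected_heterozygosity(orm2_freqs.values())
print(f"sum of frequencies = {total:.3f}")
print(f"expected heterozygosity = {h_exp:.4f}")
```

With one allele at 0.958, only about 8% of individuals are expected to be heterozygous at this locus, which is consistent with the variation being easy to miss in smaller samples.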
Abstract:
Diabetes mellitus occurs in two forms, insulin-dependent (IDDM, formerly called juvenile type) and non-insulin dependent (NIDDM, formerly called adult type). Prevalence figures from around the world for NIDDM show that all societies and all races are affected; although uncommon in some populations (0.4%), it is common (10%) or very common (40%) in others (Tables 1 and 2). In Mexican-Americans in particular, the prevalence rates (7-10%) are intermediate to those in Caucasians (1-2%) and Amerindians (35%). Information about the distribution of the disease and identification of high risk groups for developing glucose intolerance or its vascular manifestations by the study of genetic markers will help to clarify and solve some of the problems from the public health and the genetic point of view. This research was designed to examine two general areas in relation to NIDDM. The first aims to determine the prevalence of polymorphic genetic markers in two groups distinguished by the presence or absence of diabetes and to observe whether there is any genetic marker-disease association (univariate analysis using two by two tables and logistic regression to study the individual and joint effects of the different variables). The second deals with the effect of genetic differences on the variation in fasting plasma glucose and percent glycosylated hemoglobin (HbA1) (analysis of covariance for each marker, using age and sex as covariates). The results from the first analysis were not statistically significant at the corrected p value of 0.003 given the number of tests that were performed. From the analysis of covariance of all the markers studied, only Duffy and Phosphoglucomutase were statistically significant but poor predictors, given that the amount they explain in terms of variation in glycosylated hemoglobin is very small. Trying to determine the polygenic component of chronic disease is not an easy task. 
This study confirms that a larger and random or representative sample is needed to detect differences in the prevalence of a marker for association studies and in the genetic contribution to the variation in glucose and glycosylated hemoglobin. The importance of ethnic homogeneity in the groups studied and of standardization in the methodology is stressed.
Abstract:
Atherosclerosis is widely accepted as a complex genetic phenotype and is the usual cause of cardiovascular disease, the world’s leading killer. Genetic factors have been proven to be important risk contributors for atherosclerosis and much work has been done to identify promising candidates that might play a role in the development of atherosclerosis. It is well known that many independent replications are needed to unequivocally establish a valid genotype-phenotype association across different populations before the findings are extended to clinical settings and to the expensive follow-up studies designed to identify causal genetic variants. Aiming to replicate the association with atherosclerosis in the Pathobiological Determinants of Atherosclerosis in Youth (PDAY) study, we assessed the relationship of 32 atherosclerosis candidate SNPs to atherosclerosis in the PDAY cohort, consisting of AA and EA young people aged 15-34 years who died of non-medical causes. Two association studies, a whole sample study and a 1:1 matched case control study were performed by use of multiple linear regression and logistic regression analyses, respectively. For the whole sample association study, 32 SNPs among 2,650 individuals (1,369 AA and 1,281 EA) were tested for the association with six early atherosclerosis phenotypes: abdominal aorta fatty streaks, abdominal aorta raised lesions, right coronary artery fatty streaks, right coronary artery raised lesions, thoracic aorta fatty streaks, and thoracic aorta raised lesions. For the matched case-control association study, 337 case-control paired samples were included; cases were chosen with the highest total raised lesion scores from the studied population, while controls were randomly selected from individuals that had no raised lesions and matched to cases by age, gender and race. Sixteen SNPs in 13 genes were found to be significantly associated with atherosclerosis in at least one of the PDAY association studies. 
Among these 16 findings, eight SNPs (rs9579646, rs6053733, rs3849150, rs10499903, rs2148079, rs5073691, rs10116277, and rs17228212) replicated previous results; six SNPs (rs17222814, rs10811661, rs7028570, rs7291467, rs16996148 and rs10401969) were reported as new findings exclusive to our study; and the last two of the 16 SNPs, rs501120 and rs6922269, showed either intriguing or conflicting results. SNP rs17222814 in ALOX5AP and SNP rs3849150 in LRRC18 were consistently associated with atherosclerosis in both prior and the two PDAY association studies. SNP rs3849150 was also found to be highly correlated with a non-synonymous coding SNP, rs17772611, which may damage the protein (PolyPhen score = 0.996), suggesting that SNP rs17772611 may be the causal functional variant. In conclusion, our study added more support for the association of these candidate genes with atherosclerosis. SNPs rs3849150 and rs17772611 of LRRC18, as well as SNP rs17222814 of ALOX5AP, were the most significant findings from our study, and may be ranked among the best candidates for further study.
Abstract:
Lung cancer is the leading cause of cancer-related mortality in the US. Emerging evidence has shown that host genetic factors can interact with environmental exposures to influence patient susceptibility to the disease as well as clinical outcomes, such as survival and recurrence. We aimed to identify genetic prognostic markers for non-small cell lung cancer (NSCLC), a major (85%) subtype of lung cancer, and also in other subgroups. With the fast evolution of genotyping technology, genetic association studies have progressed from the candidate gene approach, to the pathway-based approach, to the genome-wide association study (GWAS). Even in the era of GWAS, the pathway-based approach has its own advantages for studying cancer clinical outcomes: it is cost-effective, requires a smaller sample size than GWAS, and makes it easier to identify a validation population and explore gene-gene interactions. In the current study, we adopted a pathway-based approach focusing on two critical pathways: the miRNA and inflammation pathways. MicroRNAs (miRNA) post-transcriptionally regulate around 30% of human genes. Polymorphisms within miRNA processing pathways and binding sites may influence patients’ prognosis through altered gene regulation. Inflammation plays an important role in cancer initiation and progression, and has also been shown to affect patients’ clinical outcomes. We first evaluated 240 single nucleotide polymorphisms (SNPs) in miRNA biogenesis genes and predicted binding sites in NSCLC patients to determine associations with clinical outcomes in early-stage (stage I and II) and late-stage (stage III and IV) lung cancer patients, respectively. First, in 535 early-stage patients, after correcting for multiple comparisons, FZD4:rs713065 (hazard ratio [HR]:0.46, 95% confidence interval [CI]:0.32-0.65) showed a significant inverse association with survival in early stage surgery-only patients. 
SP1:rs17695156 (HR:2.22, 95% CI:1.44-3.41) and DROSHA:rs6886834 (HR:6.38, 95% CI:2.49-16.31) conferred increased risk of progression in the all-patient and surgery-only populations, respectively. FAS:rs2234978 was significantly associated with improved survival in all patients (HR:0.59, 95% CI:0.44-0.77) and in the surgery plus chemotherapy population (HR:0.19, 95% CI:0.07-0.46). Functional genomics analysis demonstrated that this variant creates a miR-651 binding site resulting in altered miRNA regulation of FAS, providing biological plausibility for the observed association. We then analyzed these associations in 598 late-stage patients. After multiple comparison corrections, no SNPs remained significant in the late stage group, while the top SNP NAT1:rs15561 (HR=1.98, 96%CI=1.32-2.94) conferred a significantly increased risk of death in the chemotherapy subgroup. To test the hypothesis that genetic variants in the inflammation-related pathways may be associated with survival in NSCLC patients, we first conducted a three-stage study. In the discovery phase, we investigated a comprehensive panel of 11,930 inflammation-related SNPs in three independent lung cancer populations. A missense SNP (rs2071554) in HLA-DOB was significantly associated with poor survival in the discovery population (HR: 1.46, 95% CI: 1.02-2.09), internal validation population (HR: 1.51, 95% CI: 1.02-2.25), and external validation population (HR: 1.52, 95% CI: 1.01-2.29). Rs2900420 in KLRK1 was significantly associated with a reduced risk of death in the discovery (HR: 0.76, 95% CI: 0.60-0.96) and internal validation (HR: 0.77, 95% CI: 0.61-0.99) populations, and the association reached borderline significance in the external validation population (HR: 0.80, 95% CI: 0.63-1.02). We also evaluated these inflammation-related SNPs in NSCLC patients who were never smokers. Lung cancer in never smokers has been increasingly recognized as a distinct disease from that in ever-smokers. 
A two-stage study was performed using a discovery population from MD Anderson (411 patients) and a validation population from Mayo Clinic (311 patients). Three SNPs (IL17RA:rs879576, BMP8A:rs698141, and STK:rs290229) that were significantly associated with survival were validated, and two further SNPs (CD74:rs1056400 and CD38:rs10805347) were borderline significant (p=0.08) in the Mayo Clinic population. In the combined analysis, IL17RA:rs879576 resulted in a 40% reduction in the risk of death (p=4.1 × 10(-5) [p=0.61, heterogeneity test]). We also validated, in the Mayo Clinic population, a survival tree created in the MD Anderson population. In conclusion, our results provide strong evidence that genetic variations in the specific pathways examined (miRNA and inflammation pathways) influence clinical outcomes in NSCLC patients, and with further functional studies, the novel loci have the potential to be translated into clinical use.