20 results for Allele frequency data
Abstract:
Recent studies indicate that polymorphic genetic markers are potentially helpful in resolving genealogical relationships among individuals in a natural population. Genetic data provide opportunities for paternity exclusion when genotypic incompatibilities are observed among individuals, and the present investigation examines the resolving power of genetic markers in unambiguous positive determination of paternity. Under the assumption that the mother for each offspring in a population is unambiguously known, an analytical expression for the fraction of males excluded from paternity is derived for the case where males and females may be derived from two different gene pools. This theoretical formulation can also be used to predict the fraction of births for each of which all but one male can be excluded from paternity. We show that even when the average probability of exclusion approaches unity, a substantial fraction of births yield equivocal mother-father-offspring determinations. The number of loci needed to increase the frequency of unambiguous determinations to a high level is beyond the scope of current electrophoretic studies in most species. Application of this theory to electrophoretic data on Chamaelirium luteum (L.) shows that in 2255 offspring derived from 273 males and 70 females, only 57 triplets could be unequivocally determined with eight polymorphic protein loci, even though the average combined exclusionary power of these loci was 73%. The distribution of potentially compatible male parents, based on multilocus genotypes, was reasonably well predicted from the allele frequency data available for these loci. We demonstrate that genetic paternity analysis in natural populations cannot be reliably based on exclusionary principles alone. In order to measure the reproductive contributions of individuals in natural populations, more elaborate likelihood principles must be deployed.
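A rough way to see why a high average exclusion probability can still leave most births unresolved: the sketch below assumes, as a simplification that is not the analytical expression derived in the abstract, that every non-father male is excluded independently with one common probability. The function name `prob_unambiguous` is hypothetical.

```python
def prob_unambiguous(p_exclusion: float, n_males: int) -> float:
    """Chance that all non-fathers among n_males candidates are excluded.

    Assumption (illustrative only): exclusions are independent and share a
    single probability p_exclusion, so the result is p ** (n_males - 1).
    """
    if not 0.0 <= p_exclusion <= 1.0:
        raise ValueError("p_exclusion must lie in [0, 1]")
    if n_males < 1:
        raise ValueError("n_males must be at least 1")
    return p_exclusion ** (n_males - 1)


if __name__ == "__main__":
    # 273 candidate males, as in the Chamaelirium luteum data set.
    for p in (0.73, 0.95, 0.99, 0.999):
        print(f"P(exclusion)={p}: P(unambiguous)={prob_unambiguous(p, 273):.4g}")
```

Under this simplification the probability of a unique assignment decays geometrically with the number of candidate males, which is consistent with the abstract's point that exclusion alone rarely resolves paternity in large populations.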
Abstract:
I studied the apolipoprotein (apo) B 3′ variable number tandem repeat (VNTR) and did computer simulations of the stepwise mutation model to address four questions: (1) How did the apo B VNTR originate? (2) What is the mutational mechanism of repeat number change at the apo B VNTR? (3) To what extent are population and molecular level events responsible for the determination of the contemporary apo B allele frequency distribution? (4) Can VNTR allele frequency distributions be explained by a simple and conservative mutation-drift model? I used three general approaches to address these questions: (1) I characterized the apo B VNTR region in non-human primate species; (2) I constructed haplotypes of polymorphic markers flanking the apo B VNTR in a sample of individuals from Lorraine, France and studied the associations between the flanking-marker haplotypes and apo B VNTR size; (3) I did computer simulations of the one-step stepwise mutation model and compared the results to real data in terms of four allele frequency distribution characteristics. The results of this work have allowed me to conclude that the apo B VNTR originated after an initial duplication of a sequence which is still present as a single copy sequence in New World monkey species. I conclude that this locus did not originate by the transposition of an array of repeats from somewhere else in the genome. It is unlikely that recombination is the primary mutational mechanism. Furthermore, the clustered nature of these associations implicates a stepwise mutational mechanism. From the high frequencies of certain haplotype-allele size combinations, it is evident that population level events have also been important in the determination of the apo B VNTR allele frequency distribution. 
Results from computer simulations of the one-step stepwise mutation model have allowed me to conclude that bimodal and multimodal allele frequency distributions are not unexpected at loci evolving via stepwise mutation mechanisms. Short tandem repeat loci fit the stepwise mutation model best, followed by microsatellite loci. I therefore conclude that there are differences in the mutational mechanisms of VNTR loci as classed by repeat unit size. (Abstract shortened by UMI.)
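To make the one-step stepwise mutation model concrete, here is a minimal sketch of one way such a simulation could be set up. It is not the author's program: the Wright-Fisher resampling step, the symmetric ±1 mutation, the floor of one repeat, and every parameter name and default are assumptions chosen for illustration.

```python
import random
from collections import Counter


def simulate_smm(pop_size, mu, generations, start_repeats=20, seed=0):
    """Simulate repeat-number alleles under drift plus one-step mutation.

    Assumptions (illustrative): haploid Wright-Fisher resampling each
    generation; each gene copy mutates with probability mu by gaining or
    losing exactly one repeat; repeat number is never allowed below 1.
    Returns a Counter mapping repeat number -> number of gene copies.
    """
    rng = random.Random(seed)
    pop = [start_repeats] * pop_size
    for _ in range(generations):
        # Drift: draw the next generation with replacement from the current one.
        pop = [rng.choice(pop) for _ in range(pop_size)]
        # Mutation: a single-step change in repeat number, up or down.
        pop = [
            max(1, allele + rng.choice((-1, 1))) if rng.random() < mu else allele
            for allele in pop
        ]
    return Counter(pop)


if __name__ == "__main__":
    counts = simulate_smm(pop_size=500, mu=0.01, generations=2000, seed=42)
    for repeats in sorted(counts):
        print(repeats, counts[repeats])
```

Running the sketch with different seeds gives different allele frequency distributions, which can then be inspected for the number of modes.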
Abstract:
Complete NotI, SfiI, XbaI and BlnI cleavage maps of Escherichia coli K-12 strain MG1655 were constructed. Techniques used included: CHEF pulsed field gel electrophoresis; transposon mutagenesis; fragment hybridization to the ordered λ library of Kohara et al.; fragment and cosmid hybridization to Southern blots; correlation of fragments and cleavage sites with EcoMap, a sequence-modified version of the genomic restriction map of Kohara et al.; and correlation of cleavage sites with DNA sequence databases. In all, 105 restriction sites were mapped and correlated with the EcoMap coordinate system. NotI, SfiI, XbaI and BlnI restriction patterns of five commonly used E. coli K-12 strains were compared to those of MG1655. The variability between strains, some of which are separated by numerous steps of mutagenic treatment, is readily detectable by pulsed-field gel electrophoresis. A model is presented to account for the difference between the strains on the basis of simple insertions, deletions, and in one case an inversion. Insertions and deletions ranged in size from 1 kb to 86 kb. Several of the larger features have previously been characterized and some of the smaller rearrangements can potentially account for previously reported genetic features of these strains. Some aspects of the frequency and distribution of NotI, SfiI, XbaI and BlnI cleavage sites were analyzed using a method based on Markov chain theory. Overlaps of Dam and Dcm methylase sites with XbaI and SfiI cleavage sites were examined. The one XbaI-Dam overlap in the database is in accord with the expected frequency of this overlap. Certain types of SfiI-Dcm overlaps are overrepresented. Of the four subtypes of SfiI-Dcm overlap, only one has a partial inhibitory effect on the activity of SfiI. Recognition sites for all four enzymes are rarer than expected based on oligonucleotide frequency data, with this effect being much stronger for XbaI and BlnI than for NotI and SfiI. 
The latter two enzyme sites are rare mainly due to apparent negative selection against GGCC (both) and CGGCCG (NotI). The former two enzyme sites are rare mainly due to effects of the VSP repair system on certain di-, tri-, and tetranucleotides, most notably CTAG. Models are proposed to explain several of the anomalies of oligonucleotide distribution in E. coli, and the biological significance of the systems that produce these anomalies is discussed.
Abstract:
Variable number of tandem repeat (VNTR) loci are genetic loci at which short sequence motifs are repeated different numbers of times among chromosomes. To explore the potential utility of VNTR loci in evolutionary studies, I have conducted a series of studies to address the following questions: (1) What are the population genetic properties of these loci? (2) What are the mutational mechanisms of repeat number change at these loci? (3) Can DNA profiles be used to measure the relatedness between a pair of individuals? (4) Can DNA fingerprints be used to measure the relatedness between populations in evolutionary studies? (5) Can microsatellite and short tandem repeat (STR) loci, which mutate in a stepwise manner, be used in evolutionary analyses? A large number of VNTR loci typed in many populations were studied by means of recently developed statistical methods. The results of this work indicate that there is no significant departure from Hardy-Weinberg expectation (HWE) at VNTR loci in most of the human populations examined, and that the departures from HWE at some VNTR loci are not solely caused by the presence of population sub-structure. A statistical procedure is developed to investigate the mutational mechanisms of VNTR loci by studying the allele frequency distributions of these loci. Comparisons of frequency distribution data on several hundred VNTR loci with the predictions of two mutation models demonstrated that there are differences among VNTR loci grouped by repeat unit size. By extending the ITO method, I derived the distribution of the number of shared bands between individuals with any kinship relationship. 
A maximum likelihood estimation procedure is proposed to estimate the relatedness between individuals from the observed number of shared bands between them. It was believed that classical measures of genetic distance are not applicable to analysis of DNA fingerprints, which reveal many minisatellite loci simultaneously in the genome, because the information regarding underlying alleles and loci is not available. I proposed a new measure of genetic distance based on band sharing between individuals that is applicable to DNA fingerprint data. To address the concern that microsatellite and STR loci may not be useful for evolutionary studies because of the convergent nature of their mutation mechanisms, I carried out a theoretical study and computer simulations; I conclude that the possible bias caused by convergent mutations can be corrected, and a novel measure of genetic distance that makes the correction is suggested. In summary, I conclude that hypervariable VNTR loci are useful in evolutionary studies of closely related populations or species, especially in the study of human evolution and the history of geographic dispersal of Homo sapiens. (Abstract shortened by UMI.)
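For readers unfamiliar with band sharing, the sketch below computes the commonly used band-sharing coefficient, twice the number of shared bands divided by the total number of bands scored in the two profiles. This is a standard similarity index shown for orientation only; it is not the new distance measure proposed in the abstract, whose form is not given there. The function name and the representation of a profile as a collection of band labels are assumptions.

```python
def band_sharing(bands_a, bands_b):
    """Band-sharing coefficient S = 2 * n_shared / (n_a + n_b).

    bands_a, bands_b: iterables of band labels (e.g. fragment sizes binned
    to integers) scored in two DNA fingerprint profiles. Duplicate labels
    within one profile are collapsed, since a band is either present or not.
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        raise ValueError("at least one profile must contain a band")
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    profile_1 = [4200, 3900, 3100, 2500, 1800]
    profile_2 = [4200, 3100, 2900, 1800, 1500]
    print(band_sharing(profile_1, profile_2))
```

The coefficient ranges from 0 (no bands in common) to 1 (identical profiles).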
Abstract:
Background. Obesity is a major health problem throughout the industrialized world. Despite numerous attempts to curtail the rapid growth of obesity, its incidence continues to rise. Therefore, it is crucial to better understand the etiology of obesity beyond the concept of energy balance. Aims. The first aim of this study was to investigate the relationship between eating behaviors and body size. The second aim was to identify genetic variation associated with eating behaviors. The third aim was to examine the joint relationships between eating behavior, body size and genetic variation. Methods. This study utilized baseline data ascertained in young adults from the Training Interventions and Genetics of Exercise (TIGER) Study. Variables assessed included eating behavior (Emotional Eating Scale, Eating Attitudes Test-26, and the Block98 Food Frequency Questionnaire), body size (body mass index, waist and hip circumference, waist/hip ratio, and percent body fat), genetic variation in genes implicated in the hypothalamic control of energy balance, and appropriate covariates (age, gender, race/ethnicity, smoking status, and physical activity). For the genetic association analyses, genotypes were collapsed by minor allele frequency, and haplotypes were estimated for each gene. Additionally, Bayesian networks were constructed in order to determine the relationships between genetic variation, eating behavior and body size. Results. We report that the EAT-26 score, caloric intake, percent fat, fiber intake, HEAT index, and daily servings of vegetables, meats, grains, and fats were significantly associated with at least one body size measure. Multiple SNPs in 17 genes and haplotypes from 12 genes were tested for their association with body size. Variation within both DRD4 and HTR2A was found to be associated with EAT-26 score. In addition, variation in the ghrelin gene (GHRL) was significantly associated with daily caloric intake. 
A significant interaction between daily servings of grains and the HEAT index and variation within the leptin receptor gene (LEPR) was shown to influence body size. Conclusion. This study has shown that there is a substantial genetic component to eating behavior and that genetic variation interacts with eating behavior to influence body size.
Abstract:
The distribution of the number of heterozygous loci in two randomly chosen gametes or in a random diploid zygote provides information regarding the nonrandom association of alleles among different genetic loci. Two alternative statistics may be employed for detection of nonrandom association of genes of different loci when observations are made on these distributions: the observed variance of the number of heterozygous loci (s²ₖ) and a goodness-of-fit criterion (X²) to contrast the observed distribution with that expected under the hypothesis of random association of genes. It is shown, by simulation, that s²ₖ is statistically more efficient than X² to detect a given extent of nonrandom association. Asymptotic normality of s²ₖ is justified, and X² is shown to follow a chi-square (χ²) distribution with partial loss of degrees of freedom arising because of estimation of parameters from the marginal gene frequency data. Whenever direct evaluations of linkage disequilibrium values are possible, tests based on maximum likelihood estimators of linkage disequilibria require a smaller sample size (number of zygotes or gametes) to detect a given level of nonrandom association in comparison with that required if such tests are conducted on the basis of s²ₖ. Summarizing multilocus genotype (or haplotype) data into classes by number of heterozygous loci thus entails an appreciable loss of information.
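The idea behind the variance statistic can be sketched as follows. If loci are independent, the number of heterozygous loci is a sum of independent indicator variables, so its expected variance is the sum of h(1 − h) over loci, where h is each locus's heterozygosity; an observed variance well above this suggests nonrandom association. The code below is a minimal illustration under that independence assumption, with hypothetical function names, and omits the sampling corrections and significance test treated in the paper.

```python
from statistics import variance


def expected_moments(heterozygosities):
    """Mean and variance of the number of heterozygous loci, assuming loci
    are independent (each locus is heterozygous with its own probability h)."""
    mean = sum(heterozygosities)
    var = sum(h * (1.0 - h) for h in heterozygosities)
    return mean, var


def observed_variance(het_counts):
    """Sample variance of the observed number of heterozygous loci,
    one count per individual (or per pair of gametes)."""
    return variance(het_counts)


if __name__ == "__main__":
    h = [0.5, 0.3, 0.1, 0.45]        # per-locus heterozygosities (made up)
    counts = [0, 1, 1, 2, 3, 1, 0, 4, 2, 1]  # made-up observations
    print("expected:", expected_moments(h))
    print("observed variance:", observed_variance(counts))
```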
Abstract:
Linkage disequilibrium methods can be used to find genes influencing quantitative trait variation in humans. Linkage disequilibrium methods can require smaller sample sizes than linkage equilibrium methods, such as the variance component approach, to find loci with a specific effect size. The increase in power is at the expense of requiring more markers to be typed to scan the entire genome. This thesis compares different linkage disequilibrium methods to determine which factors influence the power to detect disequilibrium. The costs of disequilibrium and equilibrium tests were compared to determine whether the savings in phenotyping costs when using disequilibrium methods outweigh the additional genotyping costs. Nine linkage disequilibrium tests were examined by simulation. Five tests involve selecting isolated unrelated individuals, while four involve the selection of parent-child trios (TDT). All nine tests were found to be able to identify disequilibrium with the correct significance level in Hardy-Weinberg populations. Increasing linked genetic variance and trait allele frequency were found to increase the power to detect disequilibrium, while increasing the number of generations and distance between marker and trait loci decreased the power to detect disequilibrium. Discordant sampling was used for several of the tests. It was found that the more stringent the sampling, the greater the power to detect disequilibrium in a sample of given size. The power to detect disequilibrium was not affected by the presence of polygenic effects. When the trait locus had more than two trait alleles, the maximum power of the tests was less than one. 
For the simulation methods used here, when there were more than two trait alleles there was a probability equal to one minus the heterozygosity of the marker locus that both trait alleles were in disequilibrium with the same marker allele, resulting in the marker being uninformative for disequilibrium. The five tests using isolated unrelated individuals were found to have excess error rates when there was disequilibrium due to population admixture. Increased error rates also resulted from increased unlinked major gene effects, discordant trait allele frequency, and increased disequilibrium. Polygenic effects did not affect the error rates. The tests based on the transmission disequilibrium test (TDT) were not liable to any increase in error rates. For all sample ascertainment costs, for recent mutations (<100 generations) linkage disequilibrium tests were less expensive than the variance component test to carry out. Candidate gene scans saved even more money. The use of recently admixed populations also decreased the cost of performing a linkage disequilibrium test.
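As background for the trio-based tests, the classical transmission disequilibrium test for a biallelic marker and a dichotomous trait compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring. The sketch below shows that textbook McNemar-type form only; the thesis concerns quantitative traits and its four TDT variants are not specified in the abstract, so this should not be read as their implementation. Function names are hypothetical.

```python
import math


def tdt_statistic(transmitted: int, not_transmitted: int) -> float:
    """Classical TDT statistic (b - c)^2 / (b + c).

    transmitted (b): times heterozygous parents passed the allele of
    interest to an affected child; not_transmitted (c): times they passed
    the other allele. Approximately chi-square with 1 df under the null.
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - not_transmitted) ** 2 / total


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 df,
    using P(chi2_1 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    stat = tdt_statistic(62, 38)  # made-up transmission counts
    print(stat, chi2_1df_pvalue(stat))
```

Because only transmissions within families are compared, this form of test is not inflated by population admixture, in line with the abstract's finding for the TDT-based tests.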
Abstract:
Obesity and related chronic diseases represent a tremendous public health burden among Mexican Americans, a young and rapidly expanding population. This study investigated the impact of variation within eight candidate obesity genes, which include leptin (LEP), leptin receptor (LEPR), neuropeptide Y (NPY), NPYY1 receptor (NPYY1), glucagon-like peptide-1 (GLP-1), GLP-1 receptor (GLP1R), beta-3 adrenergic receptor (β3AR), and uncoupling protein (UCP1), on variation in human obesity status and/or quantitative traits related to obesity in Mexican Americans from Starr County, Texas. The Trp64Arg polymorphism within β3AR was typed in 820 random individuals and 240 pedigrees (N = 2,044). The Arg allele frequency was significantly greater in obese versus non-obese individuals (0.20 versus 0.15, respectively). In addition, within the random sample, the Arg allele was associated with significantly greater body weight (p = 0.031) and body mass index (BMI, p = 0.008) than the Trp allele. In the family sample, the Trp64Arg locus was also linked to percent fat (p = 0.045) but not to body weight or BMI. No linkage between obesity, diabetes, hypertension, or gallbladder disease and the Trp64Arg mutation was observed in families using affected sib pair linkage analysis or the transmission disequilibrium test. Microsatellite markers proximate to the remaining seven genes were typed in 302 individuals from 59 families. Sib pair linkage analysis provided evidence for linkage between obesity and NPY within affected sibling pairs (p = 0.042; n = 170 pairs). NPY was also linked to weight (p = 0.020), abdominal circumference (p = 0.031), hip circumference (p = 0.012), DBP (p ≤ 0.005), and a composite measure of body mass/fat (p ≤ 0.048) in all sibling pairs (n = 545 pairs). Additionally, LEP was linked to waist/hip ratio (p ≤ 0.009), total cholesterol (p ≤ 0.030), and HDL cholesterol (p ≤ 0.026), and LEPR was linked to fasting blood glucose (p ≤ 0.018) and DBP (p ≤ 0.003). 
Subsequent to the linkage analyses, the NPY gene was sequenced and eight variant sites identified. Two variant sites (-880I/D and 69I/D) were typed in a random sample of 914 individuals. The -880I/D variant was significantly associated with waist/hip ratio (p = 0.035) in the entire sample (N = 914) and with BMI (p = 0.031), abdominal circumference (p = 0.044), and waist/hip ratio (p = 0.041) in a non-obese subsample (BMI < 30 kg/m², n = 594). The 69I/D variant was a rare mutation observed in only one pedigree and was not associated with obesity or body size/mass within this pedigree. Results of this study indicate that variation at or near β3AR, LEP, LEPR, and NPY may exert effects which increase obesity susceptibility and influence obesity-related measures in this population.
Abstract:
Linkage disequilibrium (LD) is defined as the nonrandom association of alleles at two or more loci in a population and may be a useful tool in a diverse array of applications including disease gene mapping, elucidating the demographic history of populations, and testing hypotheses of human evolution. However, the successful application of LD-based approaches to pertinent genetic questions is hampered by a lack of understanding about the forces that mediate the genome-wide distribution of LD within and between human populations. Delineating the genomic patterns of LD is a complex task that will require interdisciplinary research that transcends traditional scientific boundaries. The research presented in this dissertation is predicated upon the need for interdisciplinary studies, and both theoretical and experimental projects were pursued. In the theoretical studies, I have investigated the effect of genotyping errors and SNP identification strategies on estimates of LD. The primary importance of these two chapters is that they provide insights and guidance for the design of future empirical LD studies. Furthermore, I analyzed the allele frequency distribution of 26,530 single nucleotide polymorphisms (SNPs) in three populations and generated the first-generation natural selection map of the human genome, which will be an important resource for explaining and understanding genomic patterns of LD. Finally, in the experimental study, I describe a novel, simple, low-cost, high-throughput SNP genotyping method. The theoretical analyses and experimental tools developed in this dissertation will facilitate a more complete understanding of patterns of LD in human populations.
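The definition of LD as nonrandom association can be made concrete for two biallelic loci with the usual coefficient D = p(AB) − p(A)p(B) and its normalized form r² = D² / [p(A)(1 − p(A))p(B)(1 − p(B))]. The sketch below computes both from haplotype counts; the function name and argument layout are assumptions, and it presumes phased haplotypes, which the dissertation's own estimation procedures may not.

```python
def ld_stats(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r_squared) for two biallelic loci from haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are counts of the four phased haplotypes.
    r_squared is NaN when either locus is monomorphic in the sample.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / n  # frequency of allele B at the second locus
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


if __name__ == "__main__":
    print(ld_stats(40, 10, 10, 40))  # made-up counts with positive association
```

A value of D equal to zero corresponds to random association; r² equal to one means each allele at one locus always occurs with the same allele at the other.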
Abstract:
The interpretation of data on genetic variation with regard to the relative roles of different evolutionary factors that produce and maintain genetic variation depends critically on our assumptions concerning effective population size and the level of migration between neighboring populations. In humans, recent population growth and movements of specific ethnic groups across wide geographic areas mean that any theory based on assumptions of constant population size and absence of substructure is generally untenable. We examine the effects of population subdivision on the pattern of protein genetic variation in a total sample drawn from an artificial agglomerate of 12 tribal populations of Central and South America, analyzing the pooled sample as though it were a single population. Several striking findings emerge. (1) Mean heterozygosity is not sensitive to agglomeration, but the number of different alleles (allele count) is inflated, relative to neutral mutation/drift/equilibrium expectation. (2) The inflation is most serious for rare alleles, especially those which originally occurred as tribally restricted "private" polymorphisms. (3) The degree of inflation is an increasing function of both the number of populations encompassed by the sample and of the genetic divergence among them. (4) Treating an agglomerated population as though it were a panmictic unit of long standing can lead to serious biases in estimates of mutation rates, selection pressures, and effective population sizes. Current DNA studies indicate the presence of numerous genetic variants in human populations. The findings and conclusions of this paper are all fully applicable to the study of genetic variation at the DNA level as well.
Abstract:
Nonpapillary renal cell carcinoma (RCC) is an adult cancer of the kidney which occurs both in familial and sporadic forms. The familial form of RCC is associated with translocations involving chromosome 3 with a breakpoint at 3p14-p13. Studies focused on sporadic RCC have shown two commonly deleted regions at 3p14.3-p13 and 3p21.3. In addition, a more distal region mapping to 3p26-p25 has been linked to the von Hippel-Lindau (VHL) disease gene. A large proportion of VHL patients develop RCC. The short arm of human chromosome 3 can, therefore, be dissected into three distinct regions which could encode tumor suppressor genes for RCC. Loss or inactivation of one or more of these loci may be an important step in the genesis of RCC. I have used the technique of microcell-mediated chromosome transfer to introduce an intact, normal human chromosome 3 and defined fragments of 3p, dominantly marked with pSV2neo, into the highly malignant RCC cell line SN12C.19. The introduction of chromosome 3 and of a centric fragment of 3p, encompassing 3p14-q11, into SN12C.19 resulted in dramatic suppression of tumor growth in nude mice. Another defined deletion hybrid contained the region 3p12-q24 of the introduced human chromosome and failed to suppress tumorigenicity. These data define the region 3p14-p12, the most proximal region of high frequency allele loss in sporadic RCC as well as the region containing the translocation breakpoint in familial RCC, as containing a novel tumor suppressor locus involved in RCC. We have designated this locus nonpapillary renal cell carcinoma-1 (NRC-1). Furthermore, we have functional evidence that NRC-1 controls the growth of RCC cells by inducing rapid cell death in vivo.
Abstract:
Prostate cancer (PC) is a significant economic and health burden in the U.S. and Europe but its causes are largely unknown. The most significant risk factors (after gender) are age and family history of the disease. A gene with high penetrance but low frequency on chromosome 1q, HPC1, has been suggested to cause a proportion of the familial aggregation of PC but other more common genes, conferring less risk, are also thought to contribute to disease predisposition. We have pursued a strategy to study both types of genetic risk in PC. To identify high penetrance genes, affected men from thirteen families have been genotyped for genetic linkage analysis at six microsatellite markers spanning 45 cM of 1q24-25. Both LOD score and non-parametric statistics provide no significant support for HPC1 in this genomic region, although 3 of the families did combine to produce a LOD score of 0.9. These families will be included in a genome wide search for other PC predisposition genes as part of a multinational collaboration. For study of common genetic factors in PC development, leukocyte DNA samples from an unselected series of 55 patients and 67 controls have been examined for genetic differences in two other candidate genes, the androgen receptor gene, hAR, at Xq11-12, and the vitamin D receptor gene, hVDR, at 12q12-14. hAR was typed for two trinucleotide repeat length polymorphisms, (CAG)n and (GGC)n, encoding polyglutamine and polyglycine tracts, respectively, which have been implicated in PC susceptibility. These data, combined with similarly processed patients and controls from the U.K., show no consistent association of allele length with PC risk. A novel finding, however, has been a significant association between the number of GGC repeats and the length of time between diagnosis and relapse in stage T1-T4 Caucasian patients irrespective of therapy and age of the patient. 
Of 49 patients who relapsed out of 108 entering the study, those with 16 or fewer GGC repeats had an average relapse-free period of 101 (±7.7) months, while for those with more than 16 repeats the period averaged 48 (±2.9) months, a 2.1-fold difference, or 4.4 years. The second gene, hVDR, was genotyped at two polymorphisms, a synonymous C/T substitution in exon 9 identified by differential TaqI enzymatic digestion and a variable length polyA tract in the 3′ UTR. Although these polymorphisms are in strong linkage disequilibrium, only the polyA region showed a possible association with PC risk. Men homozygous for alleles with fewer than 18 A's had an increased risk (OR = 3.0, p = 0.0578) compared to controls. This result is opposite to the findings of others and may either indicate off-setting random errors which together balance out to no significant overall effect or reflect more complex genetic and/or environmental associations. Overall, this research suggests that single gene familial predisposition may be less prominent in PC than in other cancers and that the characteristics of PC pathology may be useful in identifying the effects of common genetic factors.
Abstract:
Objective. The purpose of this study was to determine the relationship between ethnicity and skin cancer risk perception while controlling for other risk factors: education, gender, age, access to healthcare, family history of skin cancer, fear, and worry. Methods. This study utilized the Health Information National Trends Survey (HINTS) dataset, a nationally representative sample of 5,586 individuals 18 years of age or older. One third of the respondents were chosen at random and asked questions involving skin cancer. Analysis was based on questions that identified skin cancer risk perception, fear of finding skin cancer, and frequency of worry about skin cancer and a variety of sociodemographic factors. Results. Ethnicity had a significant impact on risk perception scores while controlling for other risk factors. Other risk factors that also had a significant impact on risk perception scores included family history of skin cancer, age, and worry.
Abstract:
Introduction. Food frequency questionnaires (FFQ) are used to study the association between dietary intake and disease. An instructional video may potentially offer a low cost, practical method of dietary assessment training for participants, thereby reducing recall bias in FFQs. There is little evidence in the literature of the effect of using instructional videos on FFQ-based intake. Objective. This analysis compared the reported energy and macronutrient intake of two groups that were randomized either to watch an instructional video before completing an FFQ or to view the same instructional video after completing the same FFQ. Methods. In the parent study, a diverse group of students, faculty and staff from Houston Community College were randomized to two groups, stratified by ethnicity, and completed an FFQ. The "video before" group watched an instructional video about completing the FFQ prior to answering the FFQ. The "video after" group watched the instructional video after completing the FFQ. The two groups were compared on mean daily energy (Kcal/day), fat (g/day), protein (g/day), carbohydrate (g/day) and fiber (g/day) intakes using descriptive statistics and one-way ANOVA. Demographic, height, and weight information was collected. Dietary intakes were adjusted for total energy intake before the comparative analysis. BMI and age were ruled out as potential confounders. Results. There were no significant differences between the two groups in mean daily dietary intakes of energy, total fat, protein, carbohydrates and fiber. However, a pattern of higher energy intake and lower fiber intake was reported in the group that viewed the instructional video before completing the FFQ compared to those who viewed the video after. Discussion. Analysis of the difference between reported intake of energy and macronutrients showed an overall pattern, albeit not statistically significant, of higher intake in the video before versus the video after group. 
Application of instructional videos for dietary assessment may require further research to address the validity of reported dietary intakes in those who are randomized to watch an instructional video before reporting diet compared to a control group that does not view a video.