929 results for Complex quantitative traits
Abstract:
We propose robust and efficient tests and estimators for gene-environment/gene-drug interactions in family-based association studies. The methodology is designed for studies in which haplotypes, quantitative phenotypes and complex exposure/treatment variables are analyzed. Using causal inference methodology, we derive family-based association tests and estimators for the genetic main effects and the interactions. The tests and estimators are robust against population admixture and stratification without requiring adjustment for confounding variables. We illustrate the practical relevance of our approach by an application to a COPD study. The data analysis suggests a gene-environment interaction between a SNP in the Serpine gene and smoking status/pack years of smoking that reduces the FEV1 volume by about 0.02 liter per pack year of smoking. Simulation studies show that the proposed methodology is sufficiently powered for realistic sample sizes and that it provides valid tests and effect size estimators in the presence of admixture and stratification.
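The kind of interaction effect described above (a change in FEV1 per pack year that depends on genotype) can be illustrated with an ordinary regression containing a product term. The sketch below is only an illustration under stated assumptions: it uses plain least squares on simulated unrelated individuals, not the family-based causal inference estimators of the abstract, and every name and simulated value (`fit_gxe`, the allele frequency, the baseline FEV1, the noise level) is hypothetical. Only the interaction size of -0.02 liter per pack year echoes the abstract.

```python
import numpy as np


def fit_gxe(genotype, exposure, phenotype):
    """Least-squares fit of phenotype ~ 1 + G + E + G*E.

    Returns a dict with the intercept, the two main effects and the
    interaction coefficient. Hypothetical helper, not the authors' method.
    """
    g = np.asarray(genotype, dtype=float)
    e = np.asarray(exposure, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    # Design matrix: intercept, genotype, exposure, genotype x exposure.
    X = np.column_stack([np.ones_like(g), g, e, g * e])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(["intercept", "g", "e", "gxe"], beta))


# Simulated data; all parameter values here are assumptions for illustration.
rng = np.random.default_rng(0)
n = 5000
g = rng.binomial(2, 0.3, size=n)      # minor-allele count (0, 1 or 2)
e = rng.uniform(0.0, 40.0, size=n)    # pack years of smoking
y = 3.5 - 0.05 * g - 0.01 * e - 0.02 * g * e + rng.normal(0.0, 0.5, size=n)

est = fit_gxe(g, e, y)
print({name: round(float(value), 4) for name, value in est.items()})
```

The coefficient `gxe` is read as the additional change in the phenotype per unit of exposure for each extra copy of the allele. Unlike the family-based tests in the abstract, this plain regression is not protected against population stratification.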
Abstract:
My dissertation focuses on developing methods for detecting gene-gene/environment interactions and imprinting effects in human complex diseases and quantitative traits. It includes three sections: (1) generalizing the Natural and Orthogonal Interactions (NOIA) model, a coding technique originally developed for gene-gene (GxG) interaction, including its extension to reduced models; (2) developing a novel statistical approach that allows for modeling gene-environment (GxE) interactions influencing disease risk; and (3) developing a statistical approach for modeling genetic variants displaying parent-of-origin effects (POEs), such as imprinting. In the past decade, genetic researchers have identified a large number of causal variants for human genetic diseases and traits by single-locus analysis, and interaction has now become a hot topic in the effort to search for the complex network between multiple genes or environmental exposures contributing to the outcome. Epistasis, also known as gene-gene interaction, is the departure from additive genetic effects of several genes on a trait, which means that the same alleles of one gene could display different genetic effects under different genetic backgrounds. In this study, we propose to implement the NOIA model for association studies along with interaction for human complex traits and diseases. We compare the performance of the new statistical models we developed and the usual functional model by both simulation study and real data analysis. Both simulation and real data analysis revealed higher power of the NOIA GxG interaction model for detecting both main genetic effects and interaction effects. Through application to a melanoma dataset, we confirmed the previously identified significant regions for melanoma risk at 15q13.1, 16q24.3 and 9p21.3. We also identified potential interactions with these significant regions that contribute to melanoma risk.
Based on the NOIA model, we developed a novel statistical approach that allows us to model effects from a genetic factor and a binary environmental exposure that jointly influence disease risk. Both simulation and real data analyses revealed higher power of the NOIA model for detecting both main genetic effects and interaction effects for both quantitative and binary traits. We also found that estimates of the parameters from logistic regression for binary traits are no longer statistically uncorrelated under the alternative model when there is an association. Applying our novel approach to a lung cancer dataset, we confirmed that four SNPs in the 5p15 and 15q25 regions are significantly associated with lung cancer risk in the Caucasian population: rs2736100, rs402710, rs16969968 and rs8034191. We also validated that rs16969968 and rs8034191 in the 15q25 region interact significantly with smoking in the Caucasian population. Our approach identified a potential interaction of SNP rs2256543 in 6p21 with smoking in contributing to lung cancer risk. Genetic imprinting is the best-known cause of parent-of-origin effects (POEs), whereby a gene is differentially expressed depending on the parental origin of the same alleles. Genetic imprinting affects several human disorders, including diabetes, breast cancer, alcoholism, and obesity. This phenomenon has been shown to be important for normal embryonic development in mammals. Traditional association approaches ignore this important genetic phenomenon. In this study, we propose a NOIA framework for a single-locus association study that estimates both main allelic effects and POEs. We develop statistical (Stat-POE) and functional (Func-POE) models, and demonstrate conditions for orthogonality of the Stat-POE model. We conducted simulations for both quantitative and qualitative traits to evaluate the performance of the statistical and functional models with different levels of POEs.
Our results showed that the newly proposed Stat-POE model, which ensures orthogonality of variance components if Hardy-Weinberg Equilibrium (HWE) or equal minor and major allele frequencies is satisfied, had greater power for detecting the main allelic additive effect than a Func-POE model, which codes according to allelic substitutions, for both quantitative and qualitative traits. The power for detecting the POE was the same for the Stat-POE and Func-POE models under HWE for quantitative traits.
Abstract:
For complex disease genetics research in human populations, remarkable progress has been made in recent times with the publication of a number of genome-wide association scans (GWAS) and subsequent statistical replications. These studies have identified new genes and pathways implicated in disease, many of which were not known before. Given these early successes, more GWAS are being conducted and planned, both for disease and quantitative phenotypes. Many researchers and clinicians have DNA samples available on collections of families, including both cases and controls. Twin registries around the world have facilitated the collection of large numbers of families, with DNA and multiple quantitative phenotypes collected on twin pairs and their relatives. In the design of a new GWAS with a fixed budget for the number of chips, the question arises whether to include or exclude related individuals. It is commonly believed to be preferable to use unrelated individuals in the first stage of a GWAS because relatives are 'over-matched' for genotypes. In this study, we quantify that for GWAS of a quantitative phenotype, relative to a sample of unrelated individuals surprisingly little power is lost when using relatives. The advantages of using relatives are manifold, including the ability to perform more quality control, the choice to perform within-family tests of association that are robust to population stratification, and the ability to perform joint linkage and association analysis. Therefore, the advantages of using relatives in GWAS for quantitative traits may well outweigh the small disadvantage in terms of statistical power.
Abstract:
Progress in crop improvement is limited by the ability to identify favourable combinations of genotypes (G) and management practices (M) in relevant target environments (E) given the resources available to search among the myriad of possible combinations. To underpin yield advance we require prediction of phenotype based on genotype. In plant breeding, traditional phenotypic selection methods have involved measuring phenotypic performance of large segregating populations in multi-environment trials and applying rigorous statistical procedures based on quantitative genetic theory to identify superior individuals. Recent developments in the ability to inexpensively and densely map/sequence genomes have facilitated a shift from the level of the individual (genotype) to the level of the genomic region. Molecular breeding strategies using genome wide prediction and genomic selection approaches have developed rapidly. However, their applicability to complex traits remains constrained by gene-gene and gene-environment interactions, which restrict the predictive power of associations of genomic regions with phenotypic responses. Here it is argued that crop ecophysiology and functional whole plant modelling can provide an effective link between molecular and organism scales and enhance molecular breeding by adding value to genetic prediction approaches. A physiological framework that facilitates dissection and modelling of complex traits can inform phenotyping methods for marker/gene detection and underpin prediction of likely phenotypic consequences of trait and genetic variation in target environments. This approach holds considerable promise for more effectively linking genotype to phenotype for complex adaptive traits. Specific examples focused on drought adaptation are presented to highlight the concepts.
Abstract:
Obesity is a complex multifactorial disease and is a public health priority. Perilipin coats the surface of lipid droplets in adipocytes and is believed to stabilize these lipid bodies by protecting triglyceride from early lipolysis. This research project evaluated the association between genetic variation within the human perilipin (PLIN) gene and obesity-related quantitative traits and disease-related phenotypes in Non-Hispanic White (NHW) and African American (AA) participants from the Atherosclerosis Risk in Communities (ARIC) Study. Multivariate linear regression, multivariate logistic regression, and Cox proportional hazards models evaluated the association between single gene variants (rs2304794, rs894160, rs8179071, and rs2304795) and multilocus variation (rs894160 and rs2304795) within the PLIN gene and both obesity-related quantitative traits (body weight, body mass index [BMI], waist girth, waist-to-hip ratio [WHR], estimated percent body fat, and plasma total triglycerides) and disease-related phenotypes (prevalent obesity, metabolic syndrome [MetS], prevalent coronary heart disease [CHD], and incident CHD). Single variant analyses were stratified by race and by gender within race, while multilocus analyses were stratified by race. Single variant analyses revealed that rs2304794 and rs894160 were significantly related to plasma triglyceride levels in all NHWs and NHW women. Among AA women, variant rs8179071 was associated with triglyceride levels and rs2304794 was associated with risk-raising waist circumference (>0.8 in women). The multilocus effects of variants rs894160 and rs2304795 were significantly associated with body weight, waist girth, WHR, estimated percent body fat, class II obesity (BMI ≥ 35 kg/m²), class III obesity (BMI ≥ 40 kg/m²), and risk-raising WHR (>0.9 in men and >0.8 in women) in AAs.
Variant rs2304795 was significantly related to prevalent MetS among AA males and prevalent CHD in NHW women; multilocus effects of the PLIN gene were associated with prevalent CHD among NHWs. Rs2304794 was associated with incident CHD in the absence of the MetS among AAs. These findings support the hypothesis that variation within the PLIN gene influences obesity-related traits and disease-related phenotypes. Understanding these effects of the PLIN genotype on the development of obesity can potentially lead to tailored health promotion interventions that are more effective.
Abstract:
Population-wide associations between loci due to linkage disequilibrium can be used to map quantitative trait loci (QTL) with high resolution. However, spurious associations between markers and QTL can also arise as a consequence of population stratification. Statistical methods that cannot distinguish associations due to linkage disequilibrium from those arising in other ways can yield false-positive results. The transmission-disequilibrium test (TDT) is a robust test for detecting QTL. The TDT exploits within-family associations that are not affected by population stratification. However, some TDTs are formulated in a rigid form, which limits their potential applications. In this study we generalize the TDT using mixed linear models to allow greater statistical flexibility. Allelic effects are estimated with two independent parameters: one exploiting the robust within-family information and the other the potentially biased between-family information. A significant difference between these two parameters can be used as evidence for spurious association. This methodology was then used to test the effects of the fourth melanocortin receptor (MC4R) on production traits in the pig. The new analyses supported the previously reported results, i.e., that the studied polymorphism is either causal or in very strong linkage disequilibrium with the causal mutation, and provided no evidence for spurious association.
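The idea of estimating an allelic effect separately from within-family and between-family information can be sketched by splitting each genotype into its family mean and the individual's deviation from that mean. The example below is a simplified illustration, not the mixed linear model of the abstract: it uses ordinary least squares with no random effects, and the function name `within_between_effects` and the toy data are hypothetical.

```python
import numpy as np


def within_between_effects(family_ids, genotypes, phenotypes):
    """Estimate between-family and within-family allelic effects by OLS.

    Each genotype is split into the family mean (between component) and the
    deviation from that mean (within component). Simplified sketch: the
    source abstract fits these terms in a mixed linear model instead.
    """
    fam = np.asarray(family_ids)
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    # Map every individual to a family index, then compute family means.
    _, inverse = np.unique(fam, return_inverse=True)
    fam_mean = (np.bincount(inverse, weights=g) / np.bincount(inverse))[inverse]
    within = g - fam_mean
    X = np.column_stack([np.ones_like(g), fam_mean, within])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {"between": float(beta[1]), "within": float(beta[2])}


# Toy data (assumed): true allelic effect 0.5, plus a family-level shift that
# happens to track the family's mean genotype, mimicking stratification.
fam = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
geno = [0, 1, 2, 1, 1, 2, 0, 0, 1, 2, 2, 1]
shift = {1: 1.5, 2: 2.0, 3: 0.5, 4: 2.5}
pheno = [0.5 * g_i + shift[f] for f, g_i in zip(fam, geno)]

res = within_between_effects(fam, geno, pheno)
print(res)
```

In this toy example the within-family estimate stays at the true allelic effect while the between-family estimate absorbs the family-level shift. A gap between the two estimates is the kind of signal the abstract treats as evidence for spurious association.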
Abstract:
Obesity increases the risk for several conditions, including type 2 diabetes mellitus, cardiovascular disease, hypertension, osteoarthritis and certain types of cancer. Twin and family studies have shown that there is a major genetic component in the determination of body mass. In recent years several technological and scientific advances have been made in obesity research. For instance, novel replicated loci have been revealed by a number of genome-wide association studies. This thesis aimed to investigate the association of genetic factors and obesity-related quantitative traits. The first study investigated the role of the lactase gene in anthropometric traits. We genetically defined lactase persistence by genotyping 31 720 individuals of European descent. We found that lactase persistence was significantly correlated with weight and body mass index but not with height. In the second study we performed the largest whole-genome linkage scan for body mass index to date. The sample consisted of 4401 twin families and 10 535 individuals from six European countries. We found supporting evidence for two loci (3q29 and 7q36). We observed that the heritability estimate increased substantially when additional family members were removed from the analyses, which suggests reduced environmental variance in the twin sample. In the third study we assessed metabonomic, transcriptomic and genomic variation in a Finnish population cohort of 518 individuals. We formed gene expression networks to portray pathways and showed that a set of highly correlated genes of an inflammatory pathway was associated with 80 serum metabolites (of 134 quantified measures). Strong association was found, for example, with several lipoprotein subclasses. We inferred causality by using genetic variation as anchors. The expression of the network genes was found to be dependent on the circulatory metabolite concentrations.
The use of genetic correlations to evaluate associations between SNP markers and quantitative traits
Abstract:
Open-pollinated progeny of Corymbia citriodora established in replicated field trials were assessed for stem diameter, wood density, and pulp yield prior to genotyping single nucleotide polymorphisms (SNP) and testing the significance of associations between markers and assessment traits. Multiple individuals within each family were genotyped and phenotyped, which facilitated a comparison of standard association testing methods and an alternative method developed to relate markers to additive genetic effects. Narrow-sense heritability estimates indicated there was significant additive genetic variance within this population for assessment traits (ĥ² = 0.28 to 0.44) and genetic correlations between the three traits were negligible to moderate (rG = 0.08 to 0.50). The significance of association tests (p values) was compared for four different analyses based on two different approaches: (1) two software packages were used to fit standard univariate mixed models that include SNP-fixed effects, (2) bivariate and multivariate mixed models including each SNP as an additional selection trait were used. Within either the univariate or multivariate approach, correlations between the tests of significance approached +1; however, correspondence between the two approaches was less strong, although between-approach correlations remained significantly positive. Similar SNP markers would be selected using multivariate analyses and standard marker-trait association methods, where the former facilitates integration into the existing genetic analysis systems of applied breeding programs and may be used with either single markers or indices of markers created with genomic selection processes.
Abstract:
Shell traits and weight traits were measured in cultured populations of the bay scallop, Argopecten irradians. Regression analysis showed that the regression relationships for all traits were significant (P < 0.01). The correlation coefficients of body weight and tissue weight with shell length, shell height and shell width were significant (P < 0.05), but those of anterior and posterior auricle length with body weight and tissue weight were not (P > 0.05). Multiple regression equations were obtained to estimate live body weight and tissue weight. All traits except anterior and posterior auricle length were used to compare growth and production among three cultured populations. Duncan's new multiple range test showed that all traits in the Lingshuiqiao (LSQ) population were significantly greater than those in the other two populations (P < 0.01), and that the Qipanmo (QPM) and Dalijia (DLJ) populations did not differ significantly in any trait (P > 0.05). The results indicate that the LSQ population has a higher growth rate and is expected to be more productive than the other two populations.
Abstract:
Complex quantitative traits are measurable characteristics of living organisms that result from the interaction between several genes and environmental factors. The genetic loci linked to a complex trait are called "quantitative trait loci" (QTL). Recently, by treating the tissue expression levels of thousands of genes as quantitative traits, it has become possible to detect "expression QTLs" (eQTLs). While the latter have been regarded as intermediate phenotypes that help to better understand the biological architecture of complex traits, most studies still aim to identify a causal mutation in a single gene. This approach can succeed only where the implicated gene has a major effect on the complex trait, and therefore cannot elucidate situations where complex traits result from interactions among several genes. This thesis proposes a more global approach to: 1) account for the multiple possible interactions between genes when detecting eQTLs and 2) consider how polymorphisms affecting the expression of several genes within co-expression groups could contribute to complex quantitative traits. Our contributions are as follows: We developed a software tool that uses multivariate analysis methods to detect eQTLs and showed that this tool increases the detection sensitivity for a particular class of eQTLs. Based on analyses of gene expression data from tissues of recombinant inbred mice, we showed that certain polymorphisms can affect the expression of several genes within co-expression gene domains.
By combining eQTL detection studies with gene co-expression network analysis techniques in recombinant inbred mouse strains, we showed that a genetic locus could be linked both to the expression of several genes within a co-expression gene domain and to a particular complex trait (i.e., left cardiac ventricular mass). Overall, our studies allowed us to detect several mechanisms by which genetic polymorphisms can be linked to the expression of several genes, which can themselves be linked to complex quantitative traits.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
There are numerous statistical methods for quantitative trait linkage analysis in human studies. An ideal such method would have high power to detect genetic loci contributing to the trait, would be robust to non-normality in the phenotype distribution, would be appropriate for general pedigrees, would allow the incorporation of environmental covariates, and would be appropriate in the presence of selective sampling. We recently described a general framework for quantitative trait linkage analysis, based on generalized estimating equations, of which many current methods are special cases. This procedure is appropriate for general pedigrees and easily accommodates environmental covariates. In this paper, we use computer simulations to investigate the power and robustness of a variety of linkage test statistics built upon our general framework. We also propose two novel test statistics that take account of higher moments of the phenotype distribution, in order to accommodate non-normality. These new linkage tests are shown to have high power and to be robust to non-normality. While we have not yet examined the performance of our procedures in the context of selective sampling via computer simulations, the proposed tests satisfy all of the other qualities of an ideal quantitative trait linkage analysis method.
Abstract:
Linkage disequilibrium methods can be used to find genes influencing quantitative trait variation in humans. Linkage disequilibrium methods can require smaller sample sizes than linkage equilibrium methods, such as the variance component approach, to find loci with a specific effect size. The increase in power is at the expense of requiring more markers to be typed to scan the entire genome. This thesis compares different linkage disequilibrium methods to determine which factors influence the power to detect disequilibrium. The costs of disequilibrium and equilibrium tests were compared to determine whether the savings in phenotyping costs when using disequilibrium methods outweigh the additional genotyping costs. Nine linkage disequilibrium tests were examined by simulation. Five tests involved selecting isolated unrelated individuals, while four involved the selection of parent-child trios (TDT). All nine tests were found to be able to identify disequilibrium with the correct significance level in Hardy-Weinberg populations. Increasing linked genetic variance and trait allele frequency were found to increase the power to detect disequilibrium, while increasing the number of generations and the distance between marker and trait loci decreased the power to detect disequilibrium. Discordant sampling was used for several of the tests. It was found that the more stringent the sampling, the greater the power to detect disequilibrium in a sample of given size. The power to detect disequilibrium was not affected by the presence of polygenic effects. When the trait locus had more than two trait alleles, the power of the tests reached a maximum below one.
For the simulation methods used here, when there were more than two trait alleles there was a probability equal to 1-heterozygosity of the marker locus that both trait alleles were in disequilibrium with the same marker allele, resulting in the marker being uninformative for disequilibrium. The five tests using isolated unrelated individuals were found to have excess error rates when there was disequilibrium due to population admixture. Increased error rates also resulted from increased unlinked major gene effects, discordant trait allele frequency, and increased disequilibrium. Polygenic effects did not affect the error rates. The tests based on the Transmission Disequilibrium Test (TDT) showed no increase in error rates. For all sample ascertainment costs, for recent mutations (<100 generations) linkage disequilibrium tests were less expensive to carry out than the variance component test. Candidate gene scans saved even more money. The use of recently admixed populations also decreased the cost of performing a linkage disequilibrium test.
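The "1-heterozygosity" probability mentioned above can be made concrete with a small calculation. The sketch below assumes the usual expected heterozygosity under random mating, H = 1 - Σ pᵢ², and assumes that each trait allele becomes associated with a marker allele independently, with probability equal to that marker allele's frequency. Both assumptions and the function names are illustrative and are not taken from the thesis.

```python
def heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for one marker locus."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def prob_same_marker_allele(allele_freqs):
    """Probability that two independently drawn marker alleles coincide.

    This equals 1 - H, i.e. the chance (under the stated assumptions) that
    two trait alleles are tied to the same marker allele, which leaves the
    marker uninformative for disequilibrium.
    """
    return 1.0 - heterozygosity(allele_freqs)


# Hypothetical markers: a biallelic marker and a four-allele marker.
print(prob_same_marker_allele([0.5, 0.5]))
print(prob_same_marker_allele([0.25, 0.25, 0.25, 0.25]))
```

Under these assumptions a more polymorphic marker is less likely to be uninformative, which is consistent with the power ceiling described in the abstract.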