7 resultados para genome wide complex trait analysis
em Dalarna University College Electronic Archive
Resumo:
Understanding the genetic basis of traits involved in adaptation is a major challenge in evolutionary biology but remains poorly understood. Here, we use genome-wide association mapping using a custom 50 k single nucleotide polymorphism (SNP) array in a natural population of collared flycatchers to examine the genetic basis of clutch size, an important life-history trait in many animal species. We found evidence for an association on chromosome 18 where one SNP significant at the genome-wide level explained 3.9% of the phenotypic variance. We also detected two suggestive quantitative trait loci (QTLs) on chromosomes 9 and 26. Fitness differences among genotypes were generally weak and not significant, although there was some indication of a sex-by-genotype interaction for lifetime reproductive success at the suggestive QTL on chromosome 26. This implies that sexual antagonism may play a role in maintaining genetic variation at this QTL. Our findings provide candidate regions for a classic avian life-history trait that will be useful for future studies examining the molecular and cellular function of, as well as evolutionary mechanisms operating at, these loci.
Resumo:
1. Genomewide association studies (GWAS) enable detailed dissections of the genetic basis for organisms' ability to adapt to a changing environment. In long-term studies of natural populations, individuals are often marked at one point in their life and then repeatedly recaptured. It is therefore essential that a method for GWAS includes the process of repeated sampling. In a GWAS, the effects of thousands of single-nucleotide polymorphisms (SNPs) need to be fitted and any model development is constrained by the computational requirements. A method is therefore required that can fit a highly hierarchical model and at the same time is computationally fast enough to be useful. 2. Our method fits fixed SNP effects in a linear mixed model that can include both random polygenic effects and permanent environmental effects. In this way, the model can correct for population structure and model repeated measures. The covariance structure of the linear mixed model is first estimated and subsequently used in a generalized least squares setting to fit the SNP effects. The method was evaluated in a simulation study based on observed genotypes from a long-term study of collared flycatchers in Sweden. 3. The method we present here was successful in estimating permanent environmental effects from simulated repeated measures data. Additionally, we found that especially for variable phenotypes having large variation between years, the repeated measurements model has a substantial increase in power compared to a model using average phenotypes as a response. 4. The method is available in the R package RepeatABEL. It increases the power in GWAS having repeated measures, especially for long-term studies of natural populations, and the R implementation is expected to facilitate modelling of longitudinal data for studies of both animal and human populations.
Resumo:
This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights in modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to be able to correctly adjust the bias in genetic variance component estimation and gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data of an entire genome. These whole genome models were shown to have good performance in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of mean, which validated the idea of variance-controlling genes. The works in the thesis are accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).
Resumo:
The aim of this paper is to develop a flexible model for analysis of quantitative trait loci (QTL) in outbred line crosses, which includes both additive and dominance effects. Our flexible intercross analysis (FIA) model accounts for QTL that are not fixed within founder lines and is based on the variance component framework. Genome scans with FIA are performed using a score statistic, which does not require variance component estimation. RESULTS: Simulations of a pedigree with 800 F2 individuals showed that the power of FIA including both additive and dominance effects was almost 50% for a QTL with equal allele frequencies in both lines with complete dominance and a moderate effect, whereas the power of a traditional regression model was equal to the chosen significance value of 5%. The power of FIA without dominance effects included in the model was close to those obtained for FIA with dominance for all simulated cases except for QTL with overdominant effects. A genome-wide linkage analysis of experimental data from an F2 intercross between Red Jungle Fowl and White Leghorn was performed with both additive and dominance effects included in FIA. The score values for chicken body weight at 200 days of age were similar to those obtained in FIA analysis without dominance. CONCLUSION: We have extended FIA to include QTL dominance effects. The power of FIA was superior, or similar, to standard regression methods for QTL effects with dominance. The difference in power for FIA with or without dominance is expected to be small as long as the QTL effects are not overdominant. We suggest that FIA with only additive effects should be the standard model to be used, especially since it is more computationally efficient.
Resumo:
Background: Linkage mapping is used to identify genomic regions affecting the expression of complex traits. However, when experimental crosses such as F2 populations or backcrosses are used to map regions containing a Quantitative Trait Locus (QTL), the size of the regions identified remains quite large, i.e. 10 or more Mb. Thus, other experimental strategies are needed to refine the QTL locations. Advanced Intercross Lines (AIL) are produced by repeated intercrossing of F2 animals and successive generations, which decrease linkage disequilibrium in a controlled manner. Although this approach is seen as promising, both to replicate QTL analyses and fine-map QTL, only a few AIL datasets, all originating from inbred founders, have been reported in the literature. Methods: We have produced a nine-generation AIL pedigree (n = 1529) from two outbred chicken lines divergently selected for body weight at eight weeks of age. All animals were weighed at eight weeks of age and genotyped for SNP located in nine genomic regions where significant or suggestive QTL had previously been detected in the F2 population. In parallel, we have developed a novel strategy to analyse the data that uses both genotype and pedigree information of all AIL individuals to replicate the detection of and fine-map QTL affecting juvenile body weight. Results: Five of the nine QTL detected with the original F2 population were confirmed and fine-mapped with the AIL, while for the remaining four, only suggestive evidence of their existence was obtained. All original QTL were confirmed as a single locus, except for one, which split into two linked QTL. Conclusions: Our results indicate that many of the QTL, which are genome-wide significant or suggestive in the analyses of large intercross populations, are true effects that can be replicated and fine-mapped using AIL. Key factors for success are the use of large populations and powerful statistical tools. Moreover, we believe that the statistical methods we have developed to efficiently study outbred AIL populations will increase the number of organisms for which in-depth complex traits can be analyzed.
Resumo:
Obesity is heritable and predisposes to many diseases. To understand the genetic basis of obesity better, here we conduct a genome-wide association study and Metabochip meta-analysis of body mass index (BMI), a measure commonly used to define obesity and assess adiposity, in upto 339,224 individuals. This analysis identifies 97 BMI-associated loci (P < 5 x 10(-8)), 56 of which are novel. Five loci demonstrate clear evidence of several independent association signals, and many loci have significant effects on other metabolic phenotypes. The 97 loci account for similar to 2.7% of BMI variation, and genome-wide estimates suggest that common variation accounts for >20% of BMI variation. Pathway analyses provide strong support for a role of the central nervous systemin obesity susceptibility and implicate new genes and pathways, including those related to synaptic function, glutamate signalling, insulin secretion/action, energy metabolism, lipid biology and adipogenesis.