962 results for Genome wide mapping
Abstract:
The domestication and selection of pigs and rabbits have produced multiple breeds with broad phenotypic diversity. Population genomics and genome-wide association study (GWAS) analyses can be used to gain insights into the ancestral origins, genetic diversity, and presence of lethal mutations across these diverse breeds. In this thesis, we analysed a dataset obtained from three Italian pig breeds to detect deleterious alleles. We screened the dataset for genetic markers showing homozygous deficiency using two approaches: a single-marker approach and a haplotype-based approach. Moreover, GWAS analyses were performed to detect genetic markers associated with reproductive traits in pigs. In rabbits, we investigated the application of a SNP bead chip for detecting signatures of selection using different methods. This analysis was implemented for the first time in different fancy and meat rabbit breeds. Multiple approaches were used to detect signatures of selection, including Fst analysis, ROH analysis, PCAdapt analysis, and haplotype-based analysis. The analysis in pigs identified five putative deleterious SNPs and nine putative deleterious haplotypes in the analysed Italian pig breeds. The genomic regions of the detected putative deleterious markers harbor loss-of-function variants such as frameshift, start-lost, and splice donor variants. Those variants are close to important candidate genes such as IGF2BP1, ADGRL4, and HGF. In rabbits, multiple genomic regions were detected to be under selection. These genomic regions harbor candidate genes associated with coat color phenotype (MC1R, TYR, and ASIP), hair structure (LIPH), and body size (HMGA2 and COL2A1).
The results described in rabbits and pigs could be used to improve breeding programs by excluding carriers of the deleterious genetic markers and by incorporating candidate genes for coat color, body size, and meat production in rabbit breeding programs to enhance desired traits.
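The single-marker screen for homozygous deficiency mentioned in this abstract can be illustrated with a minimal sketch. It assumes Hardy-Weinberg proportions and a simple binomial model; the function names, the sample size, and the allele frequency are hypothetical and are not taken from the thesis, whose actual test may differ.

```python
from math import comb


def expected_homozygotes(n_individuals: int, allele_freq: float) -> float:
    """Expected number of homozygotes for an allele under Hardy-Weinberg proportions."""
    return n_individuals * allele_freq ** 2


def prob_at_most(observed: int, n_individuals: int, allele_freq: float) -> float:
    """Binomial probability of observing `observed` or fewer homozygotes by chance."""
    p_hom = allele_freq ** 2
    return sum(
        comb(n_individuals, k) * p_hom ** k * (1 - p_hom) ** (n_individuals - k)
        for k in range(observed + 1)
    )


# Hypothetical marker: 1,000 genotyped animals, allele frequency 0.10,
# and no homozygous animal observed.
n, q, observed = 1000, 0.10, 0
expected = expected_homozygotes(n, q)
p_value = prob_at_most(observed, n, q)
print(f"expected={expected:.1f}, observed={observed}, P={p_value:.2e}")
```

A very small probability for a marker with zero observed homozygotes is the kind of signal that would flag it as a candidate deleterious allele for follow-up.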
Abstract:
BACKGROUND: The Nuclear Factor I (NFI) family of DNA binding proteins (also called CCAAT box transcription factors or CTF) is involved in both DNA replication and gene expression regulation. Using chromatin immunoprecipitation and high-throughput sequencing (ChIP-Seq), we performed a genome-wide mapping of NFI DNA binding sites in primary mouse embryonic fibroblasts. RESULTS: We found that in vivo and in vitro NFI DNA binding specificities are indistinguishable, as in vivo ChIP-Seq NFI binding sites matched predictions based on previously established position weight matrix models of its in vitro binding specificity. Combining ChIP-Seq with mRNA profiling data, we found that NFI preferentially associates with highly expressed genes that it up-regulates, while binding sites were under-represented at expressed but unregulated genes. Genomic binding also correlated with markers of transcribed genes such as histone modifications H3K4me3 and H3K36me3, even outside of annotated transcribed loci, implicating NFI in the control of the deposition of these modifications. Positional correlation between + and - strand ChIP-Seq tags revealed that, in contrast to other transcription factors, NFI associates with a nucleosomal length of cleavage-resistant DNA, suggesting an interaction with positioned nucleosomes. In addition, NFI binding prominently occurred at boundaries displaying discontinuities in histone modifications specific to expressed and silent chromatin, such as loci subject to parental allele-specific imprinted expression. CONCLUSIONS: Our data thus suggest that NFI nucleosomal interaction may contribute to the partitioning of distinct chromatin domains and to epigenetic gene expression regulation. NFI ChIP-Seq and input control DNA data were deposited at the Gene Expression Omnibus (GEO) repository under accession number GSE15844. Gene expression microarray data for mouse embryonic fibroblasts are available under GEO accession number GSE15871.
Abstract:
The complexity of mammalian genome organization demands a complex interplay of DNA and proteins to orchestrate proper gene regulation. CTCF, a highly conserved, ubiquitously expressed protein, has been postulated as a primary organizer of genome architecture because of its roles in transcriptional activation/repression, insulation and imprinting. Diverse regulatory functions are exerted through genome-wide binding via a central eleven-zinc-finger DNA binding domain and an array of diverse protein-protein interactions through N- and C-terminal domains. CTCFL has been identified as a paralog of CTCF expressed only in spermatogenic cells of the testis. CTCF and CTCFL have a highly homologous DNA-binding domain, while the flanking amino acid sequences exhibit no significant similarity. Genome-wide mapping of CTCF binding sites has been carried out in many cell types, but no data exist for CTCFL apart from a few identified loci. The lack of high-quality antibodies prompted us to generate an endogenously FLAG-tagged CTCFL mouse model using BAC recombination. IHC staining using anti-FLAG antibodies confirmed CTCFL localization to type B spermatogonia and preleptotene spermatocytes and a mutually exclusive pattern of expression with CTCF. ChIP followed by high-throughput sequencing identified 10,382 binding sites showing 70% overlap but representing only 20% of CTCF sites. Consensus sequence analysis identified a significantly longer binding motif with prominently less ambiguity of base calling at every position. The significant difference between CTCF and CTCFL genomic binding patterns suggests that their binding to DNA is differentially regulated. Analysis of CTCFL binding to methylated regions on a genome-wide scale identified approximately 1,000 loci. Methylation-independent binding of CTCFL might be at least one of the mechanisms that ensures distinct binding patterns of CTCF and CTCFL, since CTCF binding is methylation-sensitive.
Co-localization of CTCF with cohesin has been well established, and analysis of CTCFL and SMC3 overlap identified around 3,300 binding sites from which two related but distinct consensus sequence motifs were derived. Because virtually all data for cohesin binding originate from mitotically proliferating cells, the anticipated overlap is expected to be considerably higher in meiotic cells. Meiosis-specific cohesin subunit Rec8 is specific for spermatocytes, and 6 out of the 12 identified binding sites are also bound by CTCFL. In conclusion, this was the first genome-wide mapping of CTCFL binding sites in spermatocytes, the only cell type where CTCF is not expressed. CTCFL has a unique binding site repertoire distinct from CTCF, binds to methylated sequences and shows a significant overlap with cohesin binding sites. Future efforts will be oriented towards deciphering the role CTCFL plays in conversion of chromatin structure and function from mitotic to meiotic chromosomes.
Abstract:
Master's dissertation, Qualidade em Análises - Erasmus Mundus, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2015
Abstract:
The haploid phase of spermatogenesis (spermiogenesis) is characterized by a major change in chromatin structure and in the DNA topology of the spermatid. The mechanisms by which this change occurs, and the proteins involved, are not yet fully understood. My work established the presence of transient double-strand breaks during this remodeling using the comet assay and pulsed-field gel electrophoresis. Immunofluorescence on tissue sections and the use of a highly active nuclear extract confirmed the presence of topoisomerases and of markers of DNA repair systems. The repair proteins identified belong to error-prone systems, so this structural reshaping of chromatin could be genetically unstable and could explain the paternal bias observed for de novo mutations in recent high-throughput screening studies. A technique allowing specific immunocapture of double-strand breaks was developed and applied to murine spermatids representing different steps of differentiation. High-throughput sequencing results showed that the double-strand breaks of spermiogenesis (hotspots) occur mostly in intergenic DNA, notably in LINE1 sequences, satellite DNA and simple repeats. The hotspots also contain binding motifs for proteins of the FOX and PRDM families, whose functions include binding and locally remodeling condensed chromatin. In addition, the binding motif of the BRCA1 protein is enriched in double-strand break hotspots. BRCA1 acts, among other things, in DNA repair by non-homologous end joining (NHEJ) and in the repair of DNA-topoisomerase adducts.
Remarkably, the recognition motif of the SPO11 protein, which is involved in the formation of meiotic breaks, was enriched in the hotspots, suggesting that the meiotic machinery is also used during spermiogenesis to form breaks. Finally, although the hotspots are located mainly in intergenic sequences, the targeted genes are involved in brain and neuron development. These results are consistent with the predominantly paternal origin observed for de novo mutations associated with autism spectrum disorders and schizophrenia, and with their increase with paternal age. Because the processes of spermatid chromatin remodeling are evolutionarily conserved, these results suggest that chromatin remodeling during spermiogenesis represents an additional mechanism contributing to the formation of de novo mutations, explaining the paternal bias observed for certain types of mutations.
Abstract:
Topoisomerase I (Top1) poisons are among the most clinically effective drugs used for colon, ovary and lung cancers. Unpublished data from our lab have recently revealed that the structurally unrelated Top1 poisons Camptothecin (CPT) and Indimitecan (LMP776) induce the formation of micronuclei (MNi) in human cancer cells. In addition, MNi trigger an innate immune gene response by stimulating the cGAS/STING pathway. As the mechanisms of MNi formation are not fully determined, our aim here is to establish how MNi form after Top1 poisoning. Using immunofluorescence assays and EdU labelling of nascent DNA, our results show that, after 24 hours of recovery, a short treatment with sub-cytotoxic doses of Top1 poisons induces the formation of MNi that do not contain newly synthesized (EdU+) DNA. We also saw that Top1 poisons delay the replication machinery, reducing EdU incorporation, and produce significant levels of the damage markers γH2AX and p53BP1 in S-phase cells but not in G1 and G2/M cells. The results also show that MNi formation is dependent on R-loops, as RNaseH1 overexpression markedly reduces Top1-induced MNi. Genome-wide mapping of R-loops by the DRIP-seq technique revealed that CPT decreases R-loop levels at some sites and increases them at others. In particular, increased R-loops are mainly found at active genes and always overlapped with Top1cc sites. We also found that increased R-loops overlap with lamina-associated chromatin domains, while decreased R-loops correlate with replication origin sites. Overall, our data are consistent with the formation of MNi due to R-loop increase and under-replication at specific regions caused by Top1 poisons. These results will eventually help in developing new strategies for effective personalized interventions by using Top1-targeted compounds as immuno-modulators in cancer patients.
Abstract:
Linkage disequilibrium (LD) mapping is commonly used as a fine mapping tool in human genome mapping and has been used with some success for initial disease gene isolation in certain isolated inbred human populations. An understanding of the population history of domestic dog breeds suggests that LD mapping could be routinely utilized in this species for initial genome-wide scans. Such an approach offers significant advantages over traditional linkage analysis. Here, we demonstrate, using canine copper toxicosis in the Bedlington terrier as the model, that LD mapping could be reasonably expected to be a useful strategy in low-resolution, genome-wide scans in pure-bred dogs. Significant LD was demonstrated over distances up to 33.3 cM. It is very unlikely, for a number of reasons discussed, that this result could be extrapolated to the rest of the genome. It is, however, consistent with the expectation given the population structure of canine breeds and, in this breed at least, with the hypothesis that it may be possible to utilize LD in a genome-wide scan. In this study, LD mapping confirmed the location of the copper toxicosis in Bedlington terrier (CT-BT) gene and was able to do so in a population that was refractory to traditional linkage analysis.
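As background to the LD measurements in this abstract, the following minimal sketch computes three standard pairwise LD statistics (D, D' and r²) from haplotype and allele frequencies. The function name and the example frequencies are hypothetical; the abstract does not state which LD statistic the authors used.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D_max is the largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d ** 2 / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical marker pair: both alleles at frequency 0.5, AB haplotype at 0.4.
d, d_prime, r2 = ld_measures(0.4, 0.5, 0.5)
print(f"D={d:.3f}, D'={d_prime:.3f}, r2={r2:.3f}")
```

In this sketch D' keeps the sign of D; many reports quote its absolute value instead.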
Abstract:
The identification of alternatively spliced transcripts has contributed to a better comprehension of developmental mechanisms, tissue-specific physiological processes and human diseases. Polymerase chain reaction amplification of alternatively spliced variants commonly leads to the formation of heteroduplexes as a result of base pairing involving exons common between the two variants. S1 nuclease cleaves single-stranded loops of heteroduplexes and also nicks the opposite DNA strand. In order to establish a strategy for mapping alternative splice-prone sites in the whole transcriptome, we developed a method combining the formation of heteroduplexes between 2 distinct splicing variants and S1 nuclease digestion. For 20 consensuses identified here using this methodology, 5 revealed a conserved splice site after inspection of the cDNA alignment against the human genome (exact splice sites). For 8 other consensuses, conserved splice sites were mapped at 2 to 30 bp from the border, called proximal splice sites; for the other 7 consensuses, conserved splice sites were mapped at 40 to 800 bp, called distal splice sites. These latter cases showed a nonspecific activity of S1 nuclease in digesting double-strand DNA. From the 20 consensuses identified here, 5 were selected for reverse transcription-polymerase chain reaction validation, confirming the splice sites. These data showed the potential of the strategy in mapping splice sites. However, the lack of specificity of the S1 nuclease enzyme is a significant obstacle that impedes the use of this strategy in large-scale studies.
Abstract:
Progressive myoclonus epilepsy (PME) has a number of causes, of which Unverricht-Lundborg disease (ULD) is the most common. ULD has previously been mapped to a locus on chromosome 21 (EPM1). Subsequently, mutations in the cystatin B gene have been found in most cases. In the present work we identified an inbred Arab family with a clinical pattern compatible with ULD, but mutations in the cystatin B gene were absent. We sought to characterize the clinical and molecular features of the disorder. The family was studied by multiple field trips to their town to clarify details of the complex consanguineous relationships and to personally examine the family. DNA was collected for subsequent molecular analyses from 21 individuals. A genome-wide screen was performed using 811 microsatellite markers. Homozygosity mapping was used to identify loci of interest. There were eight affected individuals. Clinical onset was at 7.3 +/- 1.5 years with myoclonic or tonic-clonic seizures. All had myoclonus that progressed in severity over time and seven had tonic-clonic seizures. Ataxia, in addition to myoclonus, occurred in all. Detailed cognitive assessment was not possible, but there was no significant progressive dementia. There was intrafamily variation in severity; three required wheelchairs in adult life; the others could walk unaided. MRI, muscle and skin biopsies on one individual were unremarkable. We mapped the family to a 15-megabase region at the pericentromeric region of chromosome 12 with a maximum lod score of 6.32. Although the phenotype of individual subjects was typical of ULD, the mean age of onset (7.3 years versus 11 years for ULD) was younger. The locus on chromosome 12 does not contain genes for any other form of PME, nor does it have genes known to be related to cystatin B. This represents a new form of PME and we have designated the locus as EPM1B.
Abstract:
Many disorders are associated with altered serum protein concentrations, including malnutrition, cancer, and cardiovascular, kidney, and inflammatory diseases. Although these protein concentrations are highly heritable, relatively little is known about their underlying genetic determinants. Through transethnic meta-analysis of European-ancestry and Japanese genome-wide association studies, we identified six loci at genome-wide significance (p < 5 × 10⁻⁸) for serum albumin (HPN-SCN1B, GCKR-FNDC4, SERPINF2-WDR81, TNFRSF11A-ZCCHC2, FRMD5-WDR76, and RPS11-FCGRT, in up to 53,190 European-ancestry and 9,380 Japanese individuals) and three loci for total protein (TNFRS13B, 6q21.3, and ELL2, in up to 25,539 European-ancestry and 10,168 Japanese individuals). We observed little evidence of heterogeneity in allelic effects at these loci between groups of European and Japanese ancestry but obtained substantial improvements in the resolution of fine mapping of potential causal variants by leveraging transethnic differences in the distribution of linkage disequilibrium. We demonstrated a functional role for the most strongly associated serum albumin locus, HPN, for which Hpn knockout mice manifest low plasma albumin concentrations. Other loci associated with serum albumin harbor genes related to ribosome function, protein translation, and proteasomal degradation, whereas those associated with serum total protein include genes related to immune function. Our results highlight the advantages of transethnic meta-analysis for the discovery and fine mapping of complex trait loci and have provided initial insights into the underlying genetic architecture of serum protein concentrations and their association with human disease.
Abstract:
CD6 has recently been identified and validated as a risk gene for multiple sclerosis (MS), based on the association of a single nucleotide polymorphism (SNP), rs17824933, located in intron 1. CD6 is a cell surface scavenger receptor involved in T-cell activation and proliferation, as well as in thymocyte differentiation. In this study, we performed a haptag SNP screen of the CD6 gene locus using a total of thirteen tagging SNPs, of which three were non-synonymous SNPs, and replicated the recently reported GWAS SNP rs650258 in a Spanish-Basque collection of 814 controls and 823 cases. Validation of the six most strongly associated SNPs was performed in an independent collection of 2265 MS patients and 2600 healthy controls. We identified association of haplotypes composed of two non-synonymous SNPs [rs11230563 (R225W) and rs2074225 (A257V)] in the second SRCR domain with susceptibility to MS (P max(T) permutation = 1 × 10⁻⁴). The effect of these haplotypes on CD6 surface expression and cytokine secretion was also tested. The analysis showed significantly different CD6 expression patterns in the distinct cell subsets, i.e., CD4+ naïve cells, P = 0.0001; CD8+ naïve cells, P < 0.0001; CD4+ and CD8+ central memory cells, P = 0.01 and 0.05, respectively; and natural killer T (NKT) cells, P = 0.02; with the protective haplotype (RA) showing higher expression of CD6. However, no significant changes were observed in natural killer (NK) cells, effector memory and terminally differentiated effector memory T cells. Our findings reveal that this new MS-associated CD6 risk haplotype significantly modifies expression of CD6 on CD4+ and CD8+ T cells.
Abstract:
β-blockers and β-agonists are primarily used to treat cardiovascular diseases. Inter-individual variability in response to both drug classes is well recognized, yet the identity and relative contribution of the genetic players involved are poorly understood. This work is the first genome-wide association study (GWAS) addressing the values and susceptibility of cardiovascular-related traits to a selective β1-blocker, atenolol (ate), and a β-agonist, isoproterenol (iso). The phenotypic dataset consisted of 27 highly heritable traits, each measured across 22 inbred mouse strains and four pharmacological conditions. The genotypic panel comprised 79,922 informative SNPs of the mouse HapMap resource. Associations were mapped by Efficient Mixed Model Association (EMMA), a method that corrects for the population structure and genetic relatedness of the various strains. A total of 205 separate genome-wide scans were analyzed. The most significant hits include three candidate loci related to cardiac and body weight, three loci for electrocardiographic (ECG) values, two loci for the susceptibility of atrial weight index to iso, four loci for the susceptibility of systolic blood pressure (SBP) to perturbations of the β-adrenergic system, and one locus for the responsiveness of QTc (p < 10⁻⁸). An additional 60 loci were suggestive for one or the other of the 27 traits, while 46 others were suggestive for one or the other drug effects (p < 10⁻⁶). Most hits tagged unexpected regions, yet at least two loci for the susceptibility of SBP to β-adrenergic drugs pointed at members of the hypothalamic-pituitary-thyroid axis. Loci for cardiac-related traits were preferentially enriched in genes expressed in the heart, while 23% of the testable loci were replicated with datasets of the Mouse Phenome Database (MPD). Altogether these data and validation tests indicate that the mapped loci are relevant to the traits and responses studied.
Abstract:
Inherited retinal dystrophies are phenotypically and genetically heterogeneous. This extensive heterogeneity poses a challenge when performing molecular diagnosis of patients, especially in developing countries. In this study, we applied homozygosity mapping as a tool to reduce the complexity given by genetic heterogeneity and identify disease-causing variants in consanguineous Pakistani pedigrees. DNA samples from eight families with autosomal recessive retinal dystrophies were subjected to genome-wide homozygosity mapping (seven by SNP arrays and one by STR markers), and genes comprised within the detected homozygous regions were analyzed by Sanger sequencing. All families displayed consistent autozygous genomic regions. Sequence analysis of candidate genes identified four previously reported mutations in CNGB3, CNGA3, RHO, and PDE6A, as well as three novel mutations: c.2656C>T (p.L886F) in RPGRIP1, c.991G>C (p.G331R) in CNGA3, and c.413-1G>A (IVS6-1G>A) in CNGB1. This latter mutation impacted pre-mRNA splicing of CNGB1 by creating a -1 frameshift leading to a premature termination codon. In addition to better delineating the genetic landscape of inherited retinal dystrophies in Pakistan, our data confirm that combining homozygosity mapping and candidate gene sequencing is a powerful approach for mutation identification in populations where consanguineous unions are common.
Abstract:
Background: Association mapping, initially developed in human disease genetics, is now being applied to plant species. The model species Arabidopsis provided some of the first examples of association mapping in plants, identifying previously cloned flowering time genes, despite high population sub-structure. More recently, association genetics has been applied to barley, where breeding activity has resulted in a high degree of population sub-structure. A major genotypic division within barley is that between winter- and spring-sown varieties, which differ in their requirement for vernalization to promote subsequent flowering. To date, all attempts to validate association genetics in barley by identifying major flowering time loci that control vernalization requirement (VRN-H1 and VRN-H2) have failed. Here, we validate the use of association genetics in barley by identifying VRN-H1 and VRN-H2, despite their prominent role in determining population sub-structure. Results: By taking barley as a typical inbreeding crop, and seasonal growth habit as a major partitioning phenotype, we develop an association mapping approach which successfully identifies VRN-H1 and VRN-H2, the underlying loci largely responsible for this agronomic division. We find that a combination of Structured Association followed by Genomic Control, used to correct for population structure and inflation of the test statistic, resolved significant associations only with VRN-H1 and the VRN-H2 candidate genes, as well as two genes closely linked to VRN-H1 (HvCSFs1 and HvPHYC). Conclusion: We show that, after employing appropriate statistical methods to correct for population sub-structure, the genome-wide partitioning effect of allelic status at VRN-H1 and VRN-H2 does not result in the high levels of spurious association expected to occur in highly structured samples. Furthermore, we demonstrate that both VRN-H1 and the candidate VRN-H2 genes can be identified using association mapping.
Discrimination between intragenic VRN-H1 markers was achieved, indicating that candidate causative polymorphisms may be discerned and prioritised within a larger set of positive associations. This proof of concept study demonstrates the feasibility of association mapping in barley, even within highly structured populations. A major advantage of this method is that it does not require large numbers of genome-wide markers, and is therefore suitable for fine mapping and candidate gene evaluation, especially in species for which large numbers of genetic markers are either unavailable or too costly.
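The Genomic Control step named in this abstract can be sketched as follows. This is a minimal illustration of the general technique (dividing 1-degree-of-freedom chi-square statistics by an inflation factor estimated from their median), not the authors' pipeline; the function name and the test statistics are hypothetical, and the Structured Association step is not shown.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for 1-df chi-square association tests."""
    # Inflation factor: observed median over the expected median under the null.
    # It is floored at 1 so that statistics are never inflated by the correction.
    lam = max(1.0, median(chi2_stats) / CHI2_1DF_MEDIAN)
    return lam, [x / lam for x in chi2_stats]


# Hypothetical test statistics from five markers.
stats = [0.2, 0.5, 0.9, 1.4, 12.0]
lam, corrected = genomic_control(stats)
print(f"lambda={lam:.3f}", [round(x, 3) for x in corrected])
```

After correction the median statistic equals the null median, so only markers that stand out from the genome-wide background remain large.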
Abstract:
Background: Linkage mapping is used to identify genomic regions affecting the expression of complex traits. However, when experimental crosses such as F2 populations or backcrosses are used to map regions containing a Quantitative Trait Locus (QTL), the size of the regions identified remains quite large, i.e. 10 or more Mb. Thus, other experimental strategies are needed to refine the QTL locations. Advanced Intercross Lines (AIL) are produced by repeated intercrossing of F2 animals and successive generations, which decrease linkage disequilibrium in a controlled manner. Although this approach is seen as promising, both to replicate QTL analyses and fine-map QTL, only a few AIL datasets, all originating from inbred founders, have been reported in the literature. Methods: We have produced a nine-generation AIL pedigree (n = 1529) from two outbred chicken lines divergently selected for body weight at eight weeks of age. All animals were weighed at eight weeks of age and genotyped for SNP located in nine genomic regions where significant or suggestive QTL had previously been detected in the F2 population. In parallel, we have developed a novel strategy to analyse the data that uses both genotype and pedigree information of all AIL individuals to replicate the detection of and fine-map QTL affecting juvenile body weight. Results: Five of the nine QTL detected with the original F2 population were confirmed and fine-mapped with the AIL, while for the remaining four, only suggestive evidence of their existence was obtained. All original QTL were confirmed as a single locus, except for one, which split into two linked QTL. Conclusions: Our results indicate that many of the QTL, which are genome-wide significant or suggestive in the analyses of large intercross populations, are true effects that can be replicated and fine-mapped using AIL. Key factors for success are the use of large populations and powerful statistical tools. 
Moreover, we believe that the statistical methods we have developed to efficiently study outbred AIL populations will increase the number of organisms for which in-depth complex traits can be analyzed.