962 results for Genome wide mapping
Abstract:
This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights into modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to correct the bias in genetic variance component estimation and to gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data of an entire genome. These whole-genome models were shown to perform well in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of the mean, which validated the idea of variance-controlling genes. The works in the thesis are accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).
Translocation capture sequencing: A method for high throughput mapping of chromosomal rearrangements
Abstract:
Chromosomal translocations require formation and joining of DNA double strand breaks (DSBs). These events disrupt the integrity of the genome and are involved in producing leukemias, lymphomas and sarcomas. Translocations are frequent, clonal and recurrent in mature B cell lymphomas, which bear a particularly high DNA damage burden by virtue of activation-induced cytidine deaminase (AID) expression. Despite the ubiquity of genomic rearrangements, the forces that underlie their genesis are not well understood. Here, we provide a detailed description of a new method for studying these events, translocation capture sequencing (TC-Seq). TC-Seq provides the means to document chromosomal rearrangements genome-wide in primary cells, and to discover recombination hotspots. Demonstrating its effectiveness, we successfully estimate the frequency of c-myc/IgH translocations in primary B cells, and identify hotspots of AID-mediated recombination. Furthermore, TC-Seq can be adapted to generate genome-wide rearrangement maps in any cell type and under any condition. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
Congenital anomalies of the kidney and urinary tract (CAKUT) account for the majority of end-stage renal disease in children (50%). Previous studies have mapped autosomal dominant loci for CAKUT. We here report a genome-wide search for linkage in a large pedigree of Somalian descent containing eight affected individuals with a non-syndromic form of CAKUT.
Abstract:
The domestic dog offers a unique opportunity to explore the genetic basis of disease, morphology and behaviour. Humans share many diseases with our canine companions, making dogs an ideal model organism for comparative disease genetics. Using newly developed resources, genome-wide association studies in dog breeds are proving to be exceptionally powerful. Towards this aim, veterinarians and geneticists from 12 European countries are collaborating to collect and analyse the DNA from large cohorts of dogs suffering from a range of carefully defined diseases of relevance to human health. This project, named LUPA, has already delivered considerable results. The consortium has collaborated to develop a new high density single nucleotide polymorphism (SNP) array. Mutations for four monogenic diseases have been identified and the information has been utilised to find mutations in human patients. Several complex diseases have been mapped and fine mapping is underway. These findings should ultimately lead to a better understanding of the molecular mechanisms underlying complex diseases in both humans and their best friend.
Abstract:
β-blockers and β-agonists are primarily used to treat cardiovascular diseases. Inter-individual variability in response to both drug classes is well recognized, yet the identity and relative contribution of the genetic players involved are poorly understood. This work is the first genome-wide association study (GWAS) addressing the values and susceptibility of cardiovascular-related traits to a selective β1-blocker, Atenolol (ate), and a β-agonist, Isoproterenol (iso). The phenotypic dataset consisted of 27 highly heritable traits, each measured across 22 inbred mouse strains and four pharmacological conditions. The genotypic panel comprised 79922 informative SNPs of the mouse HapMap resource. Associations were mapped by Efficient Mixed Model Association (EMMA), a method that corrects for the population structure and genetic relatedness of the various strains. A total of 205 separate genome-wide scans were analyzed. The most significant hits include three candidate loci related to cardiac and body weight, three loci for electrocardiographic (ECG) values, two loci for the susceptibility of atrial weight index to iso, four loci for the susceptibility of systolic blood pressure (SBP) to perturbations of the β-adrenergic system, and one locus for the responsiveness of QTc (p < 10^-8). An additional 60 loci were suggestive for one or the other of the 27 traits, while 46 others were suggestive for one or the other drug effects (p < 10^-6). Most hits tagged unexpected regions, yet at least two loci for the susceptibility of SBP to β-adrenergic drugs pointed at members of the hypothalamic-pituitary-thyroid axis. Loci for cardiac-related traits were preferentially enriched in genes expressed in the heart, while 23% of the testable loci were replicated with datasets of the Mouse Phenome Database (MPD). Altogether these data and validation tests indicate that the mapped loci are relevant to the traits and responses studied.
Abstract:
Recurrent airway obstruction (RAO), or heaves, is a naturally occurring asthma-like disease that is related to sensitisation and exposure to mouldy hay and has a familial basis with a complex mode of inheritance. A genome-wide scanning approach using two half-sibling families was taken in order to locate the chromosome regions that contribute to the inherited component of this condition in these families. Initially, a panel of 250 microsatellite markers, which were chosen as a well-spaced, polymorphic selection covering the 31 equine autosomes, was used to genotype the two half-sibling families, which comprised in total 239 Warmblood horses. Subsequently, supplementary markers were added for a total of 315 genotyped markers. Each half-sibling family is focused around a severely RAO-affected stallion, and the phenotype of each individual was assessed for RAO and related signs, namely, breathing effort at rest, breathing effort at work, coughing, and nasal discharge, using an owner-based questionnaire. Analysis using a regression method for half-sibling family structures was performed using RAO and each of the composite clinical signs separately; two chromosome regions (on ECA13 and ECA15) showed a genome-wide significant association with RAO at P < 0.05. An additional 11 chromosome regions showed a more modest association. This is the first publication that describes the mapping of genetic loci involved in RAO. Several candidate genes are located in these regions, a number of which are interleukins. These are important signalling molecules that are intricately involved in the control of the immune response and are therefore good positional candidates.
Abstract:
The development of a completely annotated sheep genome sequence is a key need for understanding the phylogenetic relationships and genetic diversity among the many different sheep breeds worldwide and for identifying genes controlling economically and physiologically important traits. The ovine genome sequence assembly will be crucial for developing optimized breeding programs based on highly productive, healthy sheep phenotypes that are adapted to modern breeding and production conditions. Scientists and breeders around the globe have been contributing to this goal by generating genomic and cDNA libraries, performing genome-wide and trait-associated analyses of polymorphism, expression analysis, genome sequencing, and by developing virtual and physical comparative maps. The International Sheep Genomics Consortium (ISGC), an informal network of sheep genomics researchers, is playing a major role in coordinating many of these activities. In addition to serving as an essential tool for monitoring chromosome abnormalities in specific sheep populations, ovine molecular cytogenetics provides physical anchors which link and order genome regions, such as sequence contigs, genes and polymorphic DNA markers to ovine chromosomes. Likewise, molecular cytogenetics can contribute to the process of defining evolutionary breakpoints between related species. The selective expansion of the sheep cytogenetic map, using loci to connect maps and identify chromosome bands, can substantially contribute to improving the quality of the annotated sheep genome sequence and will also accelerate its assembly. Furthermore, identifying major morphological chromosome anomalies and micro-rearrangements, such as gene duplications or deletions, that might occur between different sheep breeds and other Ovis species will also be important to understand the diversity of sheep chromosome structure and its implications for cross-breeding. 
To date, 566 loci have been assigned to specific chromosome regions in sheep and the new cytogenetic map is presented as part of this review. This review will also summarize the current cytogenomic status of the sheep genome, describe current activities in the sheep cytogenomics research sector, and will discuss the cytogenomics data in context with other major sheep genomics projects.
Replication and fine-mapping of a QTL for recurrent airway obstruction in European Warmblood horses.
Abstract:
Recurrent airway obstruction (RAO), or 'heaves', is a common performance-limiting allergic respiratory disease of mature horses. It is related to sensitization and exposure to mouldy hay and has a familial basis with a complex mode of inheritance. In a previous study, we detected a QTL for RAO on ECA 13 in a half-sib family of European Warmblood horses. In this study, we genotyped additional markers in the family and narrowed the QTL down to about 1.5 Mb (23.7-25.2 Mb). We detected the strongest association with SNP BIEC2-224511 (24,309,405 bp). We also obtained SNP genotypes in an independent cohort of 646 unrelated Warmblood horses. There was no genome-wide significant association with RAO in these unrelated horses. However, we performed a genotypic association study of the SNPs on ECA 13 in these unrelated horses, and the SNP BIEC2-224511 also showed the strongest association with RAO in the unrelated horses (p(raw) = 0.00037). The T allele at this SNP was associated with RAO both in the family and the unrelated horses. Thus, the association study in the unrelated animals provides independent support for the previously detected QTL. The association study allows further narrowing of the QTL interval to about 0.5 Mb (24.0-24.5 Mb). We sequenced the coding regions of the genes in the critical region but did not find any associated coding variants. Therefore, the causative variant underlying this QTL is likely to be a regulatory mutation.
Abstract:
Over 250 Mendelian traits and disorders caused by rare alleles have been mapped in the canine genome. Although each disease is rare in the dog as a species, they are collectively common and have a major impact on canine health. With SNP-based genotyping arrays, genome-wide association studies (GWAS) have proven to be a powerful method to map the genomic region of interest when 10-20 cases and 10-20 controls are available. However, to identify the genetic variant in associated regions, fine-mapping and targeted re-sequencing are required. Here we present a new approach using whole-genome sequencing (WGS) of a family trio without prior GWAS. As a proof of concept, we chose an autosomal recessive disease known as hereditary footpad hyperkeratosis (HFH) in Kromfohrländer dogs. To our knowledge, this is the first time this family-trio WGS approach has successfully been used to identify a genetic variant that perfectly segregates with a canine disorder. The sequencing of three Kromfohrländer dogs from a family trio (an affected offspring and both its healthy parents) resulted in an average genome coverage of 9.2X per individual. After applying stringent filtering criteria for candidate causative coding variants, 527 single nucleotide variants (SNVs) and 15 indels were found to be homozygous in the affected offspring and heterozygous in the parents. Functional annotation of coding sequence differences and prediction of their functional effect with the software packages ANNOVAR and SIFT yielded seven candidate variants located in six different genes. Of these, only FAM83G:c.155G>C (p.R52P) was found to be concordant in eight additional cases and 16 healthy Kromfohrländer dogs.
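The segregation filter described in this abstract (homozygous in the affected offspring, heterozygous in both healthy parents) can be illustrated with a minimal Python sketch. The record layout, the field names (`id`, `ref`, `offspring`, `sire`, `dam`) and the function name are hypothetical and chosen only for illustration; the abstract does not describe the authors' actual pipeline, which also included further filtering and annotation with ANNOVAR and SIFT.

```python
from typing import Dict, List, Tuple

# A genotype is modelled as an unordered pair of alleles, e.g. ("G", "C").
Genotype = Tuple[str, str]


def is_homozygous(gt: Genotype) -> bool:
    """True when both alleles are identical."""
    return gt[0] == gt[1]


def recessive_candidates(variants: List[Dict]) -> List[str]:
    """Return ids of variants consistent with autosomal recessive
    inheritance in a trio: the affected offspring is homozygous for a
    non-reference allele and both parents are heterozygous carriers.

    Each record is a dict with the hypothetical keys
    'id', 'ref', 'offspring', 'sire' and 'dam'.
    """
    hits = []
    for v in variants:
        child = v["offspring"]
        if not is_homozygous(child):
            continue
        alt = child[0]
        if alt == v["ref"]:
            # Homozygous for the reference allele: not a candidate.
            continue
        parents = (v["sire"], v["dam"])
        # Each parent must carry exactly one copy of the candidate allele.
        if all(not is_homozygous(p) and alt in p for p in parents):
            hits.append(v["id"])
    return hits
```

In practice such a filter would run over genotype calls parsed from a VCF file and would be combined with quality and coverage thresholds, which this sketch leaves out.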
Abstract:
Thoracic aortic aneurysms leading to aortic dissections (TAAD) are a major cause of morbidity and mortality in the United States. TAAD is a complication of some known genetic disorders, such as Marfan syndrome and Turner syndrome, but the majority of familial cases are not due to a known genetic syndrome. Previous studies by our group have established that nonsyndromic, familial TAAD is inherited in an autosomal dominant manner with decreased penetrance and variable expression. Using one large family with multiple members with TAAD for the genome-wide scan, a major locus for familial TAAD was mapped to 5q13–14 (TAAD1). Nine out of 15 families studied were linked to this locus, establishing that TAAD1 was a major locus, and that there was genetic heterogeneity for the condition. Mapping of the TAAD2 locus was accomplished using a single large family with multiple members with TAAD not linked to known loci of aneurysm formation. This established a second novel locus for familial TAAD on 3p24–25 (LOD score of 4.3), termed the TAAD2 locus. Two putative loci with suggestive LOD scores were mapped on 4q and 12q through a genome scan carried out using three families. The TAAD phenotype in 12 families did not segregate with known loci, indicating further genetic heterogeneity. An STS-tagged, BAC-based contig was constructed for the 7.8 Mb and 25 Mb critical intervals of TAAD1 and TAAD2, respectively, and characterized to identify the defective gene. The hypothesis that the defective genes responsible for TAAD1 and TAAD2 encode extracellular matrix (ECM) proteins, the major components of the elastic fiber system in the aortic media, was tested. Four genes encoding ECM proteins (versican, thrombospondin-3 and CRTL1 at TAAD1, and FBLN2 at TAAD2) were sequenced, but no disease-causing mutations were identified. Studies to identify the defective gene have been initiated through the positional candidate gene approach, using a combination of bioinformatics and expression studies. 
The identification of the TAAD susceptibility genes will allow for presymptomatic diagnosis of individuals at risk for this life threatening disease. The identification of the molecular defects that contribute to TAAD will also further our understanding of the proteins that provide structural integrity to the aortic wall.
Abstract:
BACKGROUND: Multiple recent genome-wide association studies (GWAS) have identified a single nucleotide polymorphism (SNP), rs10771399, at 12p11 that is associated with breast cancer risk. METHOD: We performed a fine-scale mapping study of a 700 kb region including 441 genotyped and more than 1300 imputed genetic variants in 48,155 cases and 43,612 controls of European descent, 6269 cases and 6624 controls of East Asian descent and 1116 cases and 932 controls of African descent in the Breast Cancer Association Consortium (BCAC; http://bcac.ccge.medschl.cam.ac.uk/ ), and in 15,252 BRCA1 mutation carriers in the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA). Stepwise regression analyses were performed to identify independent association signals. Data from the Encyclopedia of DNA Elements project (ENCODE) and the Cancer Genome Atlas (TCGA) were used for functional annotation. RESULTS: Analysis of data from European descendants found evidence for four independent association signals at 12p11, represented by rs7297051 (odds ratio (OR) = 1.09, 95% confidence interval (CI) = 1.06-1.12; P = 3 × 10^-9), rs805510 (OR = 1.08, 95% CI = 1.04-1.12; P = 2 × 10^-5), and rs1871152 (OR = 1.04, 95% CI = 1.02-1.06; P = 2 × 10^-4) identified in the general populations, and rs113824616 (P = 7 × 10^-5) identified in the meta-analysis of BCAC ER-negative cases and BRCA1 mutation carriers. SNPs rs7297051, rs805510 and rs113824616 were also associated with breast cancer risk at P < 0.05 in East Asians, but none of the associations were statistically significant in African descendants. Multiple candidate functional variants are located in putative enhancer sequences. Chromatin interaction data suggested that PTHLH was the likely target gene of these enhancers. Of the six variants with the strongest evidence of potential functionality, rs11049453 was statistically significantly associated with the expression of PTHLH and its nearby gene CCDC91 at P < 0.05. 
CONCLUSION: This study identified four independent association signals at 12p11 and revealed potentially functional variants, providing additional insights into the underlying biological mechanism(s) for the association observed between variants at 12p11 and breast cancer risk.
Abstract:
We analyzed genome-wide association studies (GWASs), including data from 71,638 individuals from four ancestries, for estimated glomerular filtration rate (eGFR), a measure of kidney function used to define chronic kidney disease (CKD). We identified 20 loci attaining genome-wide-significant evidence of association (p < 5 × 10^-8) with kidney function and highlighted that allelic effects on eGFR at lead SNPs are homogeneous across ancestries. We leveraged differences in the pattern of linkage disequilibrium between diverse populations to fine-map the 20 loci through construction of "credible sets" of variants driving eGFR association signals. Credible variants at the 20 eGFR loci were enriched for DNase I hypersensitivity sites (DHSs) in human kidney cells. DHS credible variants were expression quantitative trait loci for NFATC1 and RGS14 (at the SLC34A1 locus) in multiple tissues. Loss-of-function mutations in ancestral orthologs of both genes in Drosophila melanogaster were associated with altered sensitivity to salt stress. Renal mRNA expression of Nfatc1 and Rgs14 in a salt-sensitive mouse model was also reduced after exposure to a high-salt diet or induced CKD. Our study (1) demonstrates the utility of trans-ethnic fine mapping through integration of GWASs involving diverse populations with genomic annotation from relevant tissues to define molecular mechanisms by which association signals exert their effect and (2) suggests that salt sensitivity might be an important marker for biological processes that affect kidney function and CKD in humans.
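The "credible sets" mentioned in this abstract are commonly built by ranking the variants at a locus by posterior probability of driving the signal and keeping the smallest group whose cumulative probability reaches a chosen coverage. The Python sketch below shows that generic construction only. The function name, the input format and the 99% default coverage are assumptions made for illustration; the abstract does not state how the posteriors were computed or which coverage level the authors used.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.99) -> List[str]:
    """Return the smallest set of variants, ranked by posterior
    probability, whose cumulative (normalised) probability reaches
    `coverage`.

    `posteriors` maps a variant id to its posterior probability of
    driving the association signal at one locus. Values are
    renormalised so that they sum to 1.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")

    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)

    chosen: List[str] = []
    cumulative = 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen
```

A smaller credible set indicates a better-resolved locus, which is why combining populations with different linkage disequilibrium patterns can sharpen fine mapping.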
Abstract:
Malignant Pleural Mesothelioma (MPM) is a very aggressive cancer whose incidence is growing worldwide. MPM escapes the classical models of carcinogenesis and lacks a distinctive genetic fingerprint, which leaves the molecular events that lead to tumorigenesis obscure. This contributes to the limited therapeutic options and the lack of specific biomarkers, which together make MPM one of the deadliest cancers. Here we combined a functional genome-wide loss-of-function CRISPR/Cas9 screen with patients’ transcriptomic and clinical data to identify genes essential for MPM progression. In addition, we explored the role of non-coding RNAs in MPM progression by analysing gene expression profiles and clinical data from the MESO-TCGA dataset. We identified TRIM28 and the lncRNA LINC00941 as new vulnerabilities of MPM, associated with disease aggressiveness and poor patient outcome. TRIM28 is a multi-domain protein involved in many processes, including transcription regulation. We showed that TRIM28 silencing impairs MPM cell growth and clonogenicity by blocking cells in mitosis. RNA-seq profiling showed that TRIM28 loss abolished the expression of major mitotic players. Our data suggest that TRIM28 is part of the B-MYB/FOXM1-MuvB complex that specifically drives the activation of mitotic genes and thereby controls the timing of mitosis. In parallel, we found that LINC00941 is strongly associated with reduced survival probability in MPM patients. LINC00941 knockdown (KD) profoundly reduced MPM cell growth, migration and invasion. This was accompanied by changes in morphology, cytoskeleton organization and cell-cell adhesion properties. RNA-seq profiling showed that LINC00941 KD affects crucial functions of MPM, including HIF1α signalling. 
Collectively, these data provide new insights into MPM biology and demonstrate that integrating functional screening with patients’ clinical data is a powerful tool to highlight new non-genetic cancer dependencies that are associated with poor outcome in vivo, paving the way to new MPM-oriented targeted strategies and prognostic tools to improve patients’ risk-based stratification.
Abstract:
Background: High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. There is now growing interest in extending the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue is exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and one likely to become significant for population genomics and interspecific breeding applications in less domesticated and less funded plant genera. Results: We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions: This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity >= 2%. 
The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus.
Abstract:
Background: Analyses of population structure and breed diversity have provided insight into the origin and evolution of cattle. Previously, these studies have used a low density of microsatellite markers; however, with the large number of single nucleotide polymorphism markers that are now available, it is possible to perform genome-wide population genetic analyses in cattle. In this study, we used a high-density panel of SNP markers to examine population structure and diversity among eight cattle breeds sampled from Bos indicus and Bos taurus. Results: Two thousand six hundred and forty-one single nucleotide polymorphisms (SNPs) spanning all of the bovine autosomal genome were genotyped in Angus, Brahman, Charolais, Dutch Black and White Dairy, Holstein, Japanese Black, Limousin and Nelore cattle. Population structure was examined using the linkage model in the program STRUCTURE, and Fst estimates were used to construct a neighbor-joining tree to represent the phylogenetic relationship among these breeds. Conclusion: The whole-genome SNP panel identified several levels of population substructure in the set of examined cattle breeds. The greatest level of genetic differentiation was detected between the Bos taurus and Bos indicus breeds. When the Bos indicus breeds were excluded from the analysis, genetic differences among beef versus dairy and European versus Asian breeds were detected among the Bos taurus breeds. Exploration of the number of SNP loci required to differentiate between breeds showed that, with 100 SNP loci, individuals could be correctly clustered into breeds only 50% of the time; thus a large number of SNP markers is required to replace the 30 microsatellite markers that are currently commonly used in genetic diversity studies.