962 results for Genome wide mapping
Abstract:
The choice of genotyping families vs unrelated individuals is a critical factor in any large-scale linkage disequilibrium (LD) study. The use of unrelated individuals for such studies is promising, but in contrast to family designs, unrelated samples do not facilitate detection of genotyping errors, which have been shown to be of great importance for LD and linkage studies and may be even more important in genotyping collaborations across laboratories. Here we employ some of the most commonly used analysis methods to examine the relative accuracy of haplotype estimation using families vs unrelated individuals in the presence of genotyping error. The results suggest that even slight amounts of genotyping error can significantly decrease haplotype frequency and reconstruction accuracy, and that the ability to detect such errors in large families is essential when the number/complexity of haplotypes is high (low LD/common alleles). In contrast, in situations of low haplotype complexity (high LD and/or many rare alleles), unrelated individuals offer such a high degree of accuracy that there is little reason for less efficient family designs. Moreover, parent-child trios, which comprise the most popular family design and the most efficient in terms of the number of founder chromosomes per genotype but which contain little information for error detection, offer little or no gain over unrelated samples in nearly all cases, and thus do not seem a useful sampling compromise between unrelated individuals and large families. The implications of these results are discussed in the context of large-scale LD mapping projects such as the proposed genome-wide haplotype map.
Abstract:
Dramatic improvements in DNA sequencing technologies have led to a more than 1,000-fold reduction in sequencing costs over the past five years. Genome-wide research approaches can thus now be applied beyond medically relevant questions to examine the molecular-genetic basis of behavior, development and unique life histories in almost any organism. A first step for an emerging model organism is usually establishing a reference genome sequence. I offer insight gained from the fire ant genome project. First, I detail how the project came to be and how sequencing, assembly and annotation strategies were chosen. Subsequently, I describe some of the issues linked to working with data from recently sequenced genomes. Finally, I discuss an approach undertaken in a follow-up project based on the fire ant genome sequence.
Abstract:
STUDY OBJECTIVE: Prior research has identified five common genetic variants associated with narcolepsy with cataplexy in Caucasian patients. To replicate and/or extend these findings, we have tested HLA-DQB1, the previously identified 5 variants, and 10 other potential variants in a large European sample of narcolepsy with cataplexy subjects. DESIGN: Retrospective case-control study. SETTING: A recent study showed that over 76% of significant genome-wide association variants lie within DNase I hypersensitive sites (DHSs). From our previous GWAS, we identified 30 single nucleotide polymorphisms (SNPs) with P < 10⁻⁴ mapping to DHSs. Ten SNPs tagging these sites, HLA-DQB1, and all previously reported SNPs significantly associated with narcolepsy were tested for replication. PATIENTS AND PARTICIPANTS: For the GWAS, 1,261 narcolepsy patients and 1,422 HLA-DQB1*06:02-matched controls were included. For the HLA study, 1,218 patients and 3,541 controls were included. MEASUREMENTS AND RESULTS: None of the top variants within DHSs were replicated. Out of the five previously reported SNPs, only rs2858884 within the HLA region (P < 2 × 10⁻⁹) and rs1154155 within the TRA locus (P < 2 × 10⁻⁸) replicated. DQB1 typing confirmed that DQB1*06:02 confers an extraordinary risk (odds ratio 251). Four protective alleles (DQB1*06:03, odds ratio 0.17; DQB1*05:01, odds ratio 0.56; DQB1*06:09, odds ratio 0.21; DQB1*02, odds ratio 0.76) were also identified. CONCLUSION: An overwhelming portion of genetic risk for narcolepsy with cataplexy is found at the DQB1 locus. Since DQB1*06:02-positive subjects are at a 251-fold increased risk for narcolepsy, and all recent cases of narcolepsy after H1N1 vaccination are positive for this allele, DQB1 genotyping may be relevant to public health policy.
Abstract:
Waist-hip ratio (WHR) is a measure of body fat distribution and a predictor of metabolic consequences independent of overall adiposity. WHR is heritable, but few genetic variants influencing this trait have been identified. We conducted a meta-analysis of 32 genome-wide association studies for WHR adjusted for body mass index (comprising up to 77,167 participants), following up 16 loci in an additional 29 studies (comprising up to 113,636 subjects). We identified 13 new loci in or near RSPO3, VEGFA, TBX15-WARS2, NFE2L3, GRB14, DNM3-PIGC, ITPR2-SSPN, LY86, HOXC13, ADAMTS9, ZNRF3-KREMEN1, NISCH-STAB1 and CPEB4 (P = 1.9 × 10⁻⁹ to P = 1.8 × 10⁻⁴⁰) and the known signal at LYPLAL1. Seven of these loci exhibited marked sexual dimorphism, all with a stronger effect on WHR in women than men (P for sex difference = 1.9 × 10⁻³ to P = 1.2 × 10⁻¹³). These findings provide evidence for multiple loci that modulate body fat distribution independent of overall adiposity and reveal strong gene-by-sex interactions.
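To make the pooling step of such a meta-analysis concrete, the following is a minimal sketch of a fixed-effect, inverse-variance combination of per-study effect estimates. The abstract does not state which pooling model was used, so the choice of method, the function name `pool_fixed_effect`, and the example numbers are assumptions made purely for illustration.

```python
import math

def pool_fixed_effect(betas, ses):
    """Inverse-variance weighted (fixed-effect) pooling of per-study estimates.

    betas: per-study effect estimates for one variant (hypothetical values)
    ses:   matching standard errors
    Returns the pooled estimate, its standard error and a two-sided p-value.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

# Hypothetical example: two studies with equal precision.
beta, se, p = pool_fixed_effect([0.10, 0.20], [0.10, 0.10])
print(f"pooled beta={beta:.3f}, se={se:.4f}, p={p:.3g}")
```

With equal standard errors the pooled estimate is simply the mean of the study estimates, and the pooled standard error shrinks by a factor of the square root of the number of studies.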
Abstract:
Abstract: Copy number variation (CNV) of DNA segments has recently gained considerable interest as a source of genetic variation likely to play a role in phenotypic diversity and evolution. Much effort has been put into the identification and mapping of regions that vary in copy number among seemingly normal individuals, both in humans and in a number of model organisms, using both bioinformatic and hybridization-based methods. Synteny studies suggest the existence of CNV hotspots in mammalian genomes, often in connection with regions of segmental duplication. CNV alleles can be in equilibrium within a population, but can also arise de novo between generations, illustrating the highly dynamic nature of these regions. A small number of studies have assessed the effect of CNV on single loci; however, at the genome-wide scale, the functional impact of CNV remains poorly studied. We have explored the influence of CNV on gene expression, first using the Williams-Beuren syndrome (WBS) associated deletion as a model, and second at the genome-wide scale in inbred mouse strains. We found that the WBS deletion influences the expression levels not only of the hemizygous genes, but also affects the euploid genes mapping nearby. Consistently, on a genome-wide scale we observe that CNV genes are expressed at more variable levels than genes that do not vary in copy number. Likewise, CNVs influence the relative expression levels of genes that map to the flanks of the genome rearrangements, thus globally influencing tissue transcriptomes. Further studies are warranted to complete cataloguing and fine mapping of CNV regions, as well as to elucidate the different mechanisms by which CNVs influence gene expression.
Summary: Copy number variation (CNV) of DNA segments is attracting interest as a form of genetic variation likely to play a role in phenotypic diversity and evolution. Regions that vary in copy number among apparently normal individuals have been mapped and catalogued using DNA microarrays and bioinformatic analysis. Synteny studies across several mammalian species suggest the existence of regions with a high rate of variation, often linked to segmental duplications. CNV alleles can be in equilibrium within a population or can arise de novo. These facts illustrate the highly dynamic nature of these regions. A few studies have examined the effect of copy number variation at isolated loci, but the impact of this phenomenon has not been studied at the genomic scale. We examined the influence of CNVs on gene expression. We first used the deletion associated with Williams-Beuren syndrome (WBS) and then extended our study to the genome scale in inbred mouse strains. We established that the WBS deletion influences the expression not only of the hemizygous genes but also of the neighbouring euploid genes. At the genomic scale, we observe concordant phenomena: the expression of genes that vary in copy number is more variable than that of genes that do not. Moreover, like the WBS deletion, CNVs influence the expression of adjacent genes, thereby exerting a global impact on expression profiles in tissues.
Summary for a general audience: Many diseases are caused by a genetic defect. Types of mutation include the loss (deletion) of part of our genome or its duplication. Although the anomalies associated with certain diseases are known, the molecular mechanisms by which these rearrangements of our genetic material cause disease remain poorly understood. This is why we studied gene regulation in regions susceptible to deletion or duplication. In this work, we demonstrated that deletions and duplications influence the regulation of nearby genes, and that these changes occur in several organs.
Abstract:
Recently, the introduction of second-generation sequencing and further advancements in confocal microscopy have enabled system-level studies for the functional characterization of genes. The degree of complexity intrinsic to these approaches requires the development of bioinformatics methodologies and computational models for extracting meaningful biological knowledge from the enormous amount of experimental data that is continuously generated. This PhD thesis presents several novel bioinformatics methods and computational models to address specific biological questions in plant biology, using the plant Arabidopsis thaliana as a model system. First, a spatio-temporal qualitative analysis of quantitative transcript and protein profiles is applied to show the role of the BREVIS RADIX (BRX) protein in the auxin-cytokinin crosstalk for root meristem growth. The core of this PhD work is the functional characterization of the interplay between the BRX protein and the plant hormone auxin in the root meristem, using a computational model based on experimental evidence. Hypotheses generated by the model led to the discovery of a differential endocytosis pattern in the root meristem that splits the auxin transcriptional response via the plasma membrane to nucleus partitioning of BRX. This positional information system creates an auxin transcriptional pattern that deviates from the canonical auxin response and is necessary to sustain the expression of a subset of BRX-dependent auxin-responsive genes to drive root meristem growth. In the second part of this PhD thesis, we characterized the genome-wide impact of large-scale deletions on four divergent Arabidopsis natural strains, through the integration of ultra-high-throughput sequencing data with data from genomic hybridizations on tiling arrays. Analysis of the identified deletions revealed a considerable portion of protein-coding genes affected and supported a history of genomic rearrangements shaped by evolution. In the last part of the thesis, we showed that the VIP3 gene in Arabidopsis has an evolutionarily conserved role in the 3' to 5' mRNA degradation machinery, by applying a novel approach for the analysis of mRNA-Seq data from random-primed mRNA. Altogether, this PhD research contains major advancements in the study of natural genomic variation in plants and in the application of computational morphodynamics models for the functional characterization of biological pathways essential for the plant.
Recently, the introduction of second-generation sequencing and advances in confocal microscopy have enabled studies at the level of different cellular systems for the functional characterization of genes. The degree of complexity intrinsic to these approaches has required the development of bioinformatics methodologies and mathematical models in order to extract meaningful biological information from the mass of experimental data generated. This doctoral thesis presents both original bioinformatics methods and mathematical models to answer specific questions in plant biology, using the plant Arabidopsis thaliana as a model. First, a spatio-temporal qualitative analysis of quantitative transcript and protein profiles is used to show the role of the BREVIS RADIX (BRX) protein in the crosstalk between the phytohormones auxin and cytokinin in root meristem growth. The core of this thesis work is the functional characterization of the interaction between the BRX protein and the phytohormone auxin in the root meristem, using computational models based on experimental evidence. The hypotheses produced by the model led to the discovery of a differential endocytosis pattern in the root meristem that splits the transcriptional response to auxin through the partitioning of BRX from the plasma membrane to the cell nucleus. This positional information creates a transcriptional response to auxin that deviates from the canonical auxin response and is necessary to sustain the expression of a subset of auxin-responsive, BRX-dependent genes to drive meristem growth. In the second part of this doctoral thesis, we characterized the genome-wide impact of large-scale deletions in four divergent natural strains of Arabidopsis, through the integration of ultra-high-throughput sequencing with genomic hybridization on DNA arrays. Analysis of the identified deletions revealed that a considerable proportion of coding genes was affected, supporting the idea of a history of genomic rearrangement shaped during evolution. In the last part of this thesis, we showed that the VIP3 gene in Arabidopsis has an evolutionarily conserved role in the 3' to 5' mRNA degradation machinery, by applying a new approach for the analysis of mRNA sequencing data from randomly primed transcripts. Overall, this doctoral research contains major advances in the study of natural genomic variation in plants and in the application of computational morphodynamic models for the characterization of biological networks essential to the plant.
Plant development is written in the genetic code of plants. To understand how plants are able to adapt to environmental changes, it is essential to study how their genes govern their formation. The more we try to understand how a plant works, the more we realize the complexity of the biological mechanisms, to the point that the use of mathematical tools and models becomes indispensable. In this work, using the model plant Arabidopsis thaliana, we solved specific biological problems through the development and application of concrete computational methods. First, we investigated how the BREVIS RADIX (BRX) gene regulates root development by controlling the response to two hormones: auxin and cytokinin. We applied a statistical analysis to quantitative measurements of transcripts and gene products to demonstrate that BRX plays an antagonistic role in the crosstalk between these two hormones. When this molecular dialogue is disrupted, the length of the primary root is dramatically reduced. To understand how BRX responds to auxin, we developed a computational model based on experimental results. Successive simulations led to the discovery of a positional signal that controls the root's response to auxin by regulating the intracellular movement of BRX. In the second part of this thesis, we analysed the entire genome of four natural strains of Arabidopsis and found that a large proportion of their genes were missing relative to the reference strain. This result indicates that the history of genomic modifications driven by evolution determines a differential availability of functional genes in these plants. In the last part of this work, we analysed transcriptome data from plants in which the VIP3 gene was non-functional. This allowed us to discover the dual role of VIP3 in regulating transcription initiation and in transcript degradation. This dual role had previously been demonstrated only in humans. This doctoral work supports the development and application of computational methodologies as invaluable tools for resolving the complexity of biological problems in plant research. The integration of plant biology and computer science has become increasingly important for advancing our knowledge of how plants function and develop.
Abstract:
BACKGROUND & AIMS: Recently, genetic variations in MICA (lead single nucleotide polymorphism [SNP] rs2596542) were identified by a genome-wide association study (GWAS) to be associated with hepatitis C virus (HCV)-related hepatocellular carcinoma (HCC) in Japanese patients. In the present study, we sought to determine whether this SNP is predictive of HCC development in the Caucasian population as well. METHODS: An extended region around rs2596542 was genotyped in 1924 HCV-infected patients from the Swiss Hepatitis C Cohort Study (SCCS). Pair-wise correlation between key SNPs was calculated both in the Japanese and European populations (HapMap3: CEU and JPT). RESULTS: To our surprise, the minor allele A of rs2596542 in proximity of MICA appeared to have a protective impact on HCC development in Caucasians, which represents an inverse association as compared to the one observed in the Japanese population. Detailed fine-mapping analyses revealed a new SNP in HCP5 (rs2244546) upstream of MICA as a strong predictor of HCV-related HCC in the SCCS (univariable p=0.027; multivariable p=0.0002, odds ratio=3.96, 95% confidence interval=1.90-8.27). This newly identified SNP had a similarly directed effect on HCC in both Caucasian and Japanese populations, suggesting that rs2244546 may better tag a putative true variant than the originally identified SNPs. CONCLUSIONS: Our data confirm the MICA/HCP5 region as a susceptibility locus for HCV-related HCC and identify rs2244546 in HCP5 as a novel tagging SNP. In addition, our data exemplify the need for conducting meta-analyses of cohorts of different ethnicities in order to fine map GWAS signals.
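The reported odds ratio, confidence interval and multivariable p-value for rs2244546 can be cross-checked against one another. The sketch below assumes the interval is a standard Wald interval on the log-odds scale, which the abstract does not state. The function name `wald_from_ci` is hypothetical, and only the three reported numbers (3.96, 1.90, 8.27) come from the text.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_from_ci(odds_ratio, ci_low, ci_high):
    """Recover the log-scale standard error, z statistic and two-sided
    p-value implied by an odds ratio and its 95% confidence interval.

    Assumes a Wald interval: log(OR) +/- Z_975 * SE.
    """
    log_se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_975)
    z = math.log(odds_ratio) / log_se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    # For a Wald interval the point estimate is the geometric mean of the bounds.
    midpoint = math.sqrt(ci_low * ci_high)
    return log_se, z, p, midpoint

log_se, z, p, midpoint = wald_from_ci(3.96, 1.90, 8.27)
print(f"SE(log OR)={log_se:.3f}, z={z:.2f}, p={p:.2g}, midpoint={midpoint:.2f}")
```

Under that assumption the geometric midpoint of the interval falls close to the reported odds ratio, and the implied p-value is of the same order as the reported multivariable p=0.0002, so the three numbers are mutually consistent.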
Abstract:
Lancelets ('amphioxus') are the modern survivors of an ancient chordate lineage, with a fossil record dating back to the Cambrian period. Here we describe the structure and gene content of the highly polymorphic approximately 520-megabase genome of the Florida lancelet Branchiostoma floridae, and analyse it in the context of chordate evolution. Whole-genome comparisons illuminate the murky relationships among the three chordate groups (tunicates, lancelets and vertebrates), and allow not only reconstruction of the gene complement of the last common chordate ancestor but also partial reconstruction of its genomic organization, as well as a description of two genome-wide duplications and subsequent reorganizations in the vertebrate lineage. These genome-scale events shaped the vertebrate genome and provided additional genetic variation for exploitation during vertebrate evolution.
Abstract:
The characterization of expressed sequence tags (ESTs) generated from a cDNA library of Leishmania (Leishmania) amazonensis amastigotes is described. The sequencing of 93 clones generated new L. (L.) amazonensis ESTs, of which 32% are not related to any other sequences in the database and 68% presented significant similarities to known genes. The chromosome localization of some L. (L.) amazonensis ESTs was also determined in L. (L.) amazonensis and L. (L.) major. These ESTs are suitable for physical mapping of the genome, as well as for the identification of genes encoding cysteine proteinases implicated in protective immune responses in leishmaniasis.
Abstract:
BACKGROUND: IL-2 receptor (IL2R) alpha is the specific component of the high affinity IL2R system involved in the immune response and in the control of autoimmunity. METHODS AND RESULTS: Here we perform a replication and fine mapping of the IL2RA gene region, analyzing 3 SNPs previously associated with multiple sclerosis (MS) and 5 SNPs associated with type 1 diabetes (T1D) in a collection of 798 MS patients and 927 matched Caucasian controls from the south of Spain. We observed association with MS in 6 of 8 SNPs. The association of rs1570538, at the 3'-UTR end of the gene, previously reported to be weakly associated with MS, is replicated here (P = 0.032). The most associated T1D SNP (rs41295061) was not associated with MS in the present study. However, rs35285258, belonging to another independent group of SNPs associated with T1D, showed the maximal association in this study but with a different risk allele. We replicated the association of only one (rs2104286) of the two IL2RA SNPs identified in the recently performed genome-wide association study of MS. CONCLUSIONS: These findings confirm and extend the association of this gene with MS and reveal a genetic heterogeneity of the associated polymorphisms and risk alleles between MS and T1D, suggesting different immunopathological roles of IL2RA in these two diseases.
Abstract:
Platelets are the second most abundant cell type in blood and are essential for maintaining haemostasis. Their count and volume are tightly controlled within narrow physiological ranges, but there is only limited understanding of the molecular processes controlling both traits. Here we carried out a high-powered meta-analysis of genome-wide association studies (GWAS) in up to 66,867 individuals of European ancestry, followed by extensive biological and functional assessment. We identified 68 genomic loci reliably associated with platelet count and volume mapping to established and putative novel regulators of megakaryopoiesis and platelet formation. These genes show megakaryocyte-specific gene expression patterns and extensive network connectivity. Using gene silencing in Danio rerio and Drosophila melanogaster, we identified 11 of the genes as novel regulators of blood cell formation. Taken together, our findings advance understanding of novel gene functions controlling fate-determining events during megakaryopoiesis and platelet formation, providing a new example of successful translation of GWAS to function.
Abstract:
Human genetics has progressed at an unprecedented pace during the past 10 years. DNA microarrays currently allow screening of the entire human genome with high level of coverage and we are now entering the era of high-throughput sequencing. These remarkable technical advances are influencing the way medical research is conducted and have boosted our understanding of the structure of the human genome as well as of disease biology. In this context, it is crucial for clinicians to understand the main concepts and limitations of modern genetics. This review will describe key concepts in genetics, including the different types of genetic markers in the human genome, review current methods to detect DNA variation, describe major online public databases in genetics, explain key concepts in statistical genetics and finally present commonly used study designs in clinical and epidemiological research. This review will therefore concentrate on human genetic variation analysis.
Abstract:
Most approaches aiming at finding genes involved in adaptive events have focused on the detection of outlier loci, which resulted in the discovery of individually "significant" genes with strong effects. However, a collection of small effect mutations could have a large effect on a given biological pathway that includes many genes, and such a polygenic mode of adaptation has not been systematically investigated in humans. We propose here to look for evidence of polygenic selection by detecting signals of adaptation at the pathway or gene set level instead of analyzing single independent genes. Using a gene-set enrichment test to identify genome-wide signals of adaptation among human populations, we find that most pathways globally enriched for signals of positive selection are either directly or indirectly involved in immune response. We also find evidence for long-distance genotypic linkage disequilibrium, suggesting functional epistatic interactions between members of the same pathway. Our results show that past interactions with pathogens have elicited widespread and coordinated genomic responses, and suggest that adaptation to pathogens can be considered as a primary example of polygenic selection.
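The abstract does not specify which enrichment statistic was used. As a simplified stand-in, the sketch below uses a plain hypergeometric over-representation test, which asks whether a pathway contains more candidate (selection-signal) genes than expected by chance. This is a swapped-in, simpler technique chosen for illustration, and the function name `enrichment_p` and all counts are hypothetical.

```python
from math import comb

def enrichment_p(n_genes, n_in_set, n_candidates, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_genes:      total genes tested genome-wide
    n_in_set:     genes belonging to the pathway / gene set
    n_candidates: genes flagged as carrying a selection signal
    n_overlap:    flagged genes that fall inside the gene set
    """
    upper = min(n_in_set, n_candidates)
    total = comb(n_genes, n_candidates)
    # Sum the probabilities of observing n_overlap or more flagged genes in the set.
    tail = sum(
        comb(n_in_set, k) * comb(n_genes - n_in_set, n_candidates - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 genes, a 100-gene pathway, 500 flagged genes,
# 10 of which fall in the pathway (2.5 would be expected by chance).
print(f"enrichment p = {enrichment_p(20000, 100, 500, 10):.3g}")
```

A test of this kind treats genes as independent and uses a hard candidate threshold, which is one reason pathway-level methods that aggregate per-gene scores are often preferred for detecting many small, coordinated effects.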
Abstract:
Background: Cells have the ability to respond and adapt to environmental changes through activation of stress-activated protein kinases (SAPKs). Although p38 SAPK signalling is known to participate in the regulation of gene expression, little is known about the molecular mechanisms used by this SAPK to regulate stress-responsive genes and the overall set of genes regulated by p38 in response to different stimuli. Results: Here, we report a whole-genome expression analysis of mouse embryonic fibroblasts (MEFs) treated with three different p38 SAPK-activating stimuli, namely osmostress, the cytokine TNFα and the protein synthesis inhibitor anisomycin. We have found that the activation kinetics of p38α SAPK in response to these insults differ and also lead to a complex gene pattern response specific for a given stress, with a restricted set of overlapping genes. In addition, we have analysed the contribution of p38α, the major p38 family member present in MEFs, to the overall stress-induced transcriptional response by using both a chemical inhibitor (SB203580) and p38α-deficient (p38α-/-) MEFs. We show here that p38 SAPK dependency ranged between 60% and 88% depending on the treatment and that there is a very good overlap between the inhibitor treatment and the knockout cells. Furthermore, we have found that the dependency on SAPK varies depending on the time the cells are subjected to osmostress. Conclusions: Our genome-wide transcriptional analyses show a selective response to specific stimuli and a restricted common response of up to 20% of the stress up-regulated early genes that involves an important set of transcription factors, which might be critical for either cell adaptation or preparation for continuous extracellular changes. Interestingly, up to 85% of the up-regulated genes are under the transcriptional control of p38 SAPK. Thus, activation of p38 SAPK is critical to elicit the early gene expression program required for cell adaptation to stress.
Abstract:
Background: Prolificacy is the most important trait influencing the reproductive efficiency of pig production systems. The low heritability and sex-limited expression of prolificacy have hindered to some extent the improvement of this trait through artificial selection. Moreover, the relative contributions of additive, dominant and epistatic QTL to the genetic variance of pig prolificacy remain to be defined. In this work, we have addressed this issue by performing one-dimensional and bi-dimensional genome scans for number of piglets born alive (NBA) and total number of piglets born (TNB) in a three-generation Iberian by Meishan F2 intercross. Results: The one-dimensional genome scan for NBA and TNB revealed the existence of two genome-wide highly significant QTL located on SSC13 (P < 0.001) and SSC17 (P < 0.01) with effects on both traits. This relative paucity of significant results contrasted very strongly with the wide array of highly significant epistatic QTL that emerged in the bi-dimensional genome-wide scan analysis. As many as 18 epistatic QTL were found for NBA (four at P < 0.01 and five at P < 0.05) and TNB (three at P < 0.01 and six at P < 0.05). These epistatic QTL were distributed in multiple genomic regions, which covered 13 of the 18 pig autosomes, and they had small individual effects that ranged from 3 to 4% of the phenotypic variance. Different patterns of interactions (a × a, a × d, d × a and d × d) were found amongst the epistatic QTL pairs identified in the current work. Conclusions: The complex inheritance of prolificacy traits in pigs has been evidenced by identifying multiple additive (SSC13 and SSC17), dominant and epistatic QTL in an Iberian × Meishan F2 intercross. 
Our results demonstrate that a significant fraction of the phenotypic variance of swine prolificacy traits can be attributed to first-order gene-by-gene interactions emphasizing that the phenotypic effects of alleles might be strongly modulated by the genetic background where they segregate.