195 results for Copy number variants
at Université de Lausanne, Switzerland
Abstract:
BACKGROUND: Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied traits. Genomic structural variants (GSV) such as Copy Number Variation (CNV) may account for part of the missing heritability, but their comprehensive detection requires either next-generation arrays or sequencing. Sophisticated algorithms that infer CNVs by combining the intensities from SNP-probes for the two alleles can already be used to extract a partial view of such GSV from existing data sets. RESULTS: Here we present several advances to facilitate the latter approach. First, we introduce a novel CNV detection method based on a Gaussian Mixture Model. Second, we propose a new algorithm, PCA merge, for combining copy-number profiles from many individuals into consensus regions. We applied both our new methods and existing ones to data from 5612 individuals from the CoLaus study who were genotyped on Affymetrix 500K arrays. We developed a number of procedures to evaluate the performance of the different methods. These include comparison with previously published CNVs and the use of a replication sample of 239 individuals genotyped with Illumina 550K arrays. We also established a new evaluation procedure that exploits the fact that related individuals are expected to share their CNVs more frequently than randomly selected individuals. The ability to detect both rare and common CNVs provides a valuable resource that will facilitate association studies exploring potential phenotypic associations with CNVs. 
CONCLUSION: Our new methodologies for CNV detection and their evaluation will help in extracting additional information from the large amount of SNP-genotyping data on various cohorts and use this to explore structural variants and their impact on complex traits.
Abstract:
Copy number variation (CNV) has recently gained considerable interest as a source of genetic variation likely to play a role in phenotypic diversity and evolution. Much effort has been put into the identification and mapping of regions that vary in copy number among seemingly normal individuals in humans and a number of model organisms, using bioinformatics or hybridization-based methods. These efforts have uncovered associations between copy number changes and complex diseases in whole-genome association studies and have identified new genomic disorders. At the genome-wide scale, however, the functional impact of CNV remains poorly studied. Here we review the current catalogs of CNVs, their association with diseases and how they link genotype and phenotype. We describe initial evidence that genes in CNV regions are expressed at lower and more variable levels than genes mapping elsewhere, and that CNV not only affects the expression of genes varying in copy number but also has a global influence on the transcriptome. Further studies are warranted for complete cataloguing and fine mapping of CNVs, as well as to elucidate the different mechanisms by which they influence gene expression.
Abstract:
Abstract: Although the genomes of any two human individuals are more than 99.99% identical at the sequence level, some structural variation can be observed. Differences between genomes include single nucleotide polymorphisms (SNPs), inversions and copy number changes (gain or loss of DNA). The latter range from submicroscopic events (CNVs, at least 1 kb in size) to complete chromosomal aneuploidies. Small copy number variations often have no (lethal) consequences for the cell, but a few have been associated with disease susceptibility and phenotypic variation. Larger rearrangements (e.g. complete chromosome gain) are frequently associated with more severe consequences for health, such as genomic disorders and cancer. High-throughput technologies like DNA microarrays enable the detection of CNVs in a genome-wide fashion. Since the initial catalogue of CNVs in the human genome in 2006, there has been tremendous interest in CNVs in the context of both population and medical genetics. Understanding CNV patterns within and between human populations is essential to elucidate their possible contribution to disease. But genome analysis is a challenging task; the technology evolves rapidly, creating a need for novel, efficient and robust analytical tools that must be compared with existing ones. Also, while the link between CNV and disease has been established, the relative contribution of CNVs is not fully understood, and the predisposition to disease conferred by CNVs in the general population has not yet been investigated.
During my PhD thesis, I worked on several aspects related to CNVs. As I will report in chapter 3, I was interested in computational methods to detect CNVs in the general population. I had access to the CoLaus dataset, a population-based study with more than 6,000 participants from the Lausanne area. All these individuals were analysed on SNP arrays, and extensive clinical information was available. 
My work explored existing CNV detection methods, and I developed a variety of metrics to compare their performance. Since these methods did not produce entirely satisfactory results, I implemented my own method, which outperformed two existing methods. I also devised strategies to combine CNVs from different individuals into CNV regions.
I was also interested in the clinical impact of CNVs in common disease (chapter 4). Through an international collaboration led by the Centre Hospitalier Universitaire Vaudois (CHUV) and Imperial College London, I was involved as a main data analyst in the investigation of a rare deletion at chromosome 16p11 detected in obese patients. Specifically, we compared 8,456 obese patients and 11,856 individuals from the general population and found that the deletion accounted for 0.7% of morbid obesity cases and was absent in healthy non-obese controls. This highlights the importance of rare variants with strong impact and provides new insights into the design of clinical studies to identify the missing heritability in common disease.
Furthermore, I was interested in the detection of somatic copy number alterations (SCNAs) and their consequences in cancer (chapter 5). This project was a collaboration initiated by the Ludwig Institute for Cancer Research and involved other groups from the Swiss Institute of Bioinformatics, the CHUV and the Universities of Lausanne and Geneva. The focus of my work was to identify genes with altered expression levels within SCNAs in seven metastatic melanoma cell lines, using CGH and SNP arrays, RNA-seq, and karyotyping. Very few SCNA genes were shared by even two melanoma samples, making it difficult to draw any conclusions at the individual gene level. To overcome this limitation, I used a network-guided analysis to determine whether any pathways, defined by amplified or deleted genes, were common among the samples. 
Six of the melanoma samples were potentially altered in four pathways, and five samples harboured copy-number and expression changes in components of six pathways. In total, this approach identified 28 pathways. Validation with two large external melanoma datasets confirmed all but three of the detected pathways and demonstrated the utility of network-guided approaches for the analysis of both large and small datasets.
Summary: Although the genomes of two individuals are more than 99.99% similar, structural differences can be observed. These differences include single nucleotide polymorphisms, inversions and copy number changes (gain or loss of DNA). The latter range from small, so-called submicroscopic events (at least 1 kb in size), known as CNVs (copy number variants), to larger events that can affect entire chromosomes. Small variations generally have no consequences for the cell, although some have been implicated in predisposition to certain diseases and in phenotypic variation in the general population. Larger rearrangements (for example, an additional copy of a chromosome, commonly called trisomy) have more serious repercussions for health, as in certain genomic syndromes and in cancer. High-throughput technologies such as DNA microarrays allow CNVs to be detected across the whole human genome. The mapping of CNVs in the human genome in 2006 generated strong interest in population genetics and medical genetics. Detecting differences within and between populations is key to elucidating the possible contribution of CNVs to disease. However, genome analysis remains a difficult task: the technology evolves very rapidly, creating new needs for the development of tools, the improvement of existing ones, and the comparison of different methods. 
Moreover, although the link between CNVs and disease has been established, their precise contribution is not yet understood. Likewise, studies of disease predisposition conferred by CNVs detected in the general population have not yet been carried out.
During my doctorate, I focused on three main areas relating to CNVs. In chapter 3, I detail my work on methods for analysing DNA microarrays. I had access to data from the CoLaus project, a study of the population of Lausanne. In this study, the genomes of more than 6,000 individuals were analysed with SNP arrays and extensive clinical information was collected. During my work, I used and compared several CNV detection methods. As the results were not entirely satisfactory, I implemented my own method, which performs better than two of the three other methods used. I also examined strategies for combining the CNVs of different individuals into regions.
I was also interested in the clinical impact of CNVs in common genetic diseases (chapter 4). This project was made possible by a close collaboration with the Centre Hospitalier Universitaire Vaudois (CHUV) and Imperial College London. In this project, I was one of the main analysts and worked on the clinical impact of a rare deletion at chromosome 16p11 found in patients with obesity. In this multidisciplinary collaboration, we compared 8,456 patients with obesity and 11,856 individuals from the general population. We found that the deletion was implicated in 0.7% of cases of morbid obesity and was absent from healthy (non-obese) controls. Our study illustrates the importance of rare CNVs, which can have a very large clinical impact. 
Moreover, this points to an alternative to association studies for improving our understanding of the aetiology of common genetic diseases.
I also worked on the detection of somatic copy number alterations (SCNAs) and their consequences in cancer (chapter 5). This project was a collaboration initiated by the Ludwig Institute for Cancer Research and involving the Swiss Institute of Bioinformatics, the CHUV and the Universities of Lausanne and Geneva. I focused on identifying genes that are affected by SCNAs and over- or under-expressed in cell lines derived from metastatic melanomas. The data were generated with DNA arrays (CGH and SNP) and high-throughput sequencing of the transcriptome. My research showed that few genes are recurrently affected across melanomas, which makes the results difficult to interpret. To overcome these limitations, I used a network analysis to determine whether signalling pathways enriched in amplified or lost genes were common to the different samples. Of the 28 pathways detected, four are potentially deregulated in six melanomas, and six further pathways are affected in five melanomas. Validation of these results with two large public datasets confirmed all but three of these pathways. This demonstrates the usefulness of the approach for analysing both small and large datasets.
Lay summary: The advent of molecular biology, particularly over the last ten years, has revolutionised research in medical genetics. Thanks to the availability of the human reference genome from 2001, new technologies such as DNA microarrays emerged and made it possible to study the genome as a whole at a so-called submicroscopic resolution that had previously been out of reach of traditional cytogenetic techniques. 
One of the most important examples is the study of structural variation in the genome, in particular the study of gene copy number. It had been established as early as 1959, with the identification of trisomy 21 by Professor Jérôme Lejeune, that the gain of an extra chromosome causes a genetic syndrome with serious repercussions for the patient's health. Similar observations were made in oncology in cancer cells, which frequently accumulate copy number aberrations (such as the loss or gain of one or more chromosomes). From 2004, several research groups catalogued copy number changes in individuals from the general population (that is, without visible clinical symptoms). In 2006, Dr Richard Redon established the first map of copy number variation in the general population. These discoveries showed that variations in the genome are frequent and that most of them are benign, that is, without clinical consequences for the individual's health. This generated very strong interest in understanding natural variation between individuals and in better understanding genetic predisposition to certain diseases.
During my thesis, I developed new computational tools for analysing DNA microarrays in order to map these variations at the genomic scale. I used these tools to establish the variations present in the Swiss population, and I then turned to the study of factors that could explain predisposition to diseases such as obesity. This study, in collaboration with the Centre Hospitalier Universitaire Vaudois, led to the identification of a deletion on chromosome 16 that explains 0.7% of cases of morbid obesity. This study has several implications. 
First, it makes it possible to diagnose unborn children in order to determine their predisposition to obesity. Second, the locus involves about twenty genes. This allows new working hypotheses to be formulated and research to be directed towards improving our understanding of the disease, with the hope of discovering a new treatment. Finally, our study provides an alternative to genetic association studies, which have so far had only mixed success.
In the last part of my thesis, I turned to the analysis of copy number aberrations in cancer. I chose to study melanomas, which are involved in skin cancer. Melanoma is a very aggressive tumour: it is responsible for 80% of skin cancer deaths and is often resistant to the treatments used in oncology (chemotherapy, radiotherapy). As part of a collaboration between the Ludwig Institute for Cancer Research, the Swiss Institute of Bioinformatics, the CHUV and the Universities of Lausanne and Geneva, we sequenced the exome (the genes) and the transcriptome (gene expression) of seven metastatic melanomas, and performed copy number analyses with DNA microarrays and karyotyping. My work led to the development of new analysis methods suited to cancer, established the list of cell signalling pathways recurrently affected in melanoma, and identified two potential therapeutic targets previously overlooked in skin cancers.
Abstract:
Large, rare copy number variants (CNVs) have been implicated in a variety of psychiatric disorders, but the role of CNVs in recurrent depression is unclear. We performed a genome-wide analysis of large, rare CNVs in 3106 cases of recurrent depression, 459 controls screened for lifetime absence of psychiatric disorder and 5619 unscreened controls from phase 2 of the Wellcome Trust Case Control Consortium (WTCCC2). We compared the frequency of cases with CNVs against the frequency observed in each control group, analysing CNVs over the whole genome and in genic, intergenic, intronic and exonic regions. We found that deletion CNVs were associated with recurrent depression, whereas duplications were not. The effect was significant when comparing cases with WTCCC2 controls (P = 7.7 × 10^-6, odds ratio (OR) = 1.25 (95% confidence interval (CI) 1.13-1.37)) and with screened controls (P = 5.6 × 10^-4, OR = 1.52 (95% CI 1.20-1.93)). Further analysis showed that CNVs deleting protein-coding regions were largely responsible for the association. Within an analysis of regions previously implicated in schizophrenia, we found an overall enrichment of CNVs in our cases when compared with screened controls (P = 0.019). We observe an ordered increase of samples with deletion CNVs, with the lowest proportion seen in screened controls, the next highest in unscreened controls and the highest in cases. This may suggest that the absence of deletion CNVs, especially in genes, is associated with resilience to recurrent depression.
Abstract:
Copy-number variants (CNVs) represent a significant interpretative challenge, given that each CNV typically affects the dosage of multiple genes. Here we report on five individuals with coloboma, microcephaly, developmental delay, short stature, and craniofacial, cardiac, and renal defects who harbor overlapping microdeletions on 8q24.3. Fine mapping localized a commonly deleted 78 kb region that contains three genes: SCRIB, NRBP2, and PUF60. In vivo dissection of the CNV showed discrete contributions of the planar cell polarity effector SCRIB and the splicing factor PUF60 to the syndromic phenotype, and the combinatorial suppression of both genes exacerbated some, but not all, phenotypic components. Consistent with these findings, we identified an individual with microcephaly, short stature, intellectual disability, and heart defects with a de novo c.505C>T variant leading to a p.His169Tyr change in PUF60. Functional testing of this allele in vivo and in vitro showed that the mutation perturbs the relative dosage of two PUF60 isoforms and, subsequently, the splicing efficiency of downstream PUF60 targets. These data inform the functions of two genes not associated previously with human genetic disease and demonstrate how CNVs can exhibit complex genetic architecture, with the phenotype being the amalgam of both discrete dosage dysfunction of single transcripts and also of binary genetic interactions.
Abstract:
Copy number variants (CNVs) are major contributors to genetic disorders. We have dissected a region of the 16p11.2 chromosome--which encompasses 29 genes--that confers susceptibility to neurocognitive defects when deleted or duplicated. Overexpression of each human transcript in zebrafish embryos identified KCTD13 as the sole message capable of inducing the microcephaly phenotype associated with the 16p11.2 duplication, whereas suppression of the same locus yielded the macrocephalic phenotype associated with the 16p11.2 deletion, capturing the mirror phenotypes of humans. Analyses of zebrafish and mouse embryos suggest that microcephaly is caused by decreased proliferation of neuronal progenitors with concomitant increase in apoptosis in the developing brain, whereas macrocephaly arises by increased proliferation and no changes in apoptosis. A role for KCTD13 dosage changes is consistent with autism in both a recently reported family with a reduced 16p11.2 deletion and a subject reported here with a complex 16p11.2 rearrangement involving de novo structural alteration of KCTD13. Our data suggest that KCTD13 is a major driver for the neurodevelopmental phenotypes associated with the 16p11.2 CNV, reinforce the idea that one or a small number of transcripts within a CNV can underpin clinical phenotypes, and offer an efficient route to identifying dosage-sensitive loci.
Abstract:
BACKGROUND: Defining the molecular genomic basis of the likelihood of developing depressive disorder is a considerable challenge. We previously associated rare, exonic deletion copy number variants (CNV) with recurrent depressive disorder (RDD). Sex chromosome abnormalities also have been observed to co-occur with RDD. METHODS: In this reanalysis of our RDD dataset (N = 3106 cases; 459 screened control samples and 2699 population control samples), we further investigated the role of larger CNVs and chromosomal abnormalities in RDD and performed association analyses with clinical data derived from this dataset. RESULTS: We found an enrichment of Turner's syndrome among cases of depression compared with the frequency observed in a large population sample (N = 34,910) of live-born infants collected in Denmark (two-sided p = .023, odds ratio = 7.76 [95% confidence interval = 1.79-33.6]), a case of diploid/triploid mosaicism, and several cases of uniparental isodisomy. In contrast to our previous analysis, large deletion CNVs were no more frequent in cases than control samples, although deletion CNVs in cases contained more genes than control samples (two-sided p = .0002). CONCLUSIONS: After statistical correction for multiple comparisons, our data do not support a substantial role for CNVs in RDD, although (as has been observed in similar samples) occasional cases may harbor large variants with etiological significance. Genetic pleiotropy and sample heterogeneity suggest that very large sample sizes are required to study conclusively the role of genetic variation in mood disorders.
Abstract:
The functional consequences of structural variation in the human genome range from adaptation, to phenotypic variation, to predisposition to diseases. Copy number variation (CNV) was shown to influence the phenotype by modifying, in a somewhat dose-dependent manner, the expression of genes that map within them, as well as that of genes located on their flanks. To assess the possible mechanism(s) behind this neighboring effect, we compared the histone modification status of cell lines from patients affected by Williams-Beuren, Williams-Beuren region duplication, Smith-Magenis or DiGeorge Syndrome and control individuals using a high-throughput version of the chromatin immunoprecipitation (ChIP) method, called ChIP-seq. We monitored monomethylation of lysine K20 on histone H4 and trimethylation of lysine K27 on histone H3, as proxies for open and condensed chromatin, respectively. Consistent with the changes in expression levels observed for multiple genes mapping on the entire length of chromosomes affected by structural variants, we also detected regions with modified histone status between samples, up- and downstream from the critical regions, up to the end of the rearranged chromosome. We also gauged the intrachromosomal interactions of these cell lines using the chromosome conformation capture (4C-seq) technique. We observed that a set of genes flanking the Williams-Beuren Syndrome critical region (WBSCR) were often looping together, possibly forming an interacting cluster with each other and the WBSCR. Deletion of the WBSCR disrupts the expression of this group of flanking genes, as well as long-range interactions between them and the rearranged interval. We conclude that large genomic rearrangements can lead to changes in the state of the chromatin spreading far away from the critical region, thus possibly affecting expression globally and as a result modifying the phenotype of the patients. 
- The functional consequences of structural variation in the human genome are wide-ranging, from adaptation, through phenotypic variation, to predisposition to certain diseases. Copy number variations (CNVs) have been shown to influence the phenotype by modifying, in a more or less dose-dependent manner, the expression of genes located within these regions as well as that of genes in the flanking regions. To study the possible mechanisms underlying this neighboring effect, we compared histone modification states in cell lines derived from patients with Williams-Beuren syndrome, Williams-Beuren region duplication, Smith-Magenis syndrome or DiGeorge syndrome and from control individuals, using a high-throughput version of the chromatin immunoprecipitation (ChIP) method called ChIP-seq. We monitored monomethylation of lysine K20 on histone H4 and trimethylation of lysine K27 on histone H3 as markers of open and closed chromatin, respectively. Consistent with the changes in expression levels observed for multiple genes along the entire length of chromosomes affected by CNVs, we also detected regions with histone modification differences between samples, located on either side of the critical regions, up to the ends of the rearranged chromosome. We also assessed the intrachromosomal interactions in these cells using the chromosome conformation capture technique (4C-seq). We observed that a group of genes flanking the Williams-Beuren syndrome critical region (WBSCR) often form a loop, constituting a group of preferential interactions between these genes and the WBSCR. 
Deletion of the WBSCR disrupts the expression of this group of flanking genes, as well as the long-range interactions between them and the rearranged region. We conclude that large genomic rearrangements can lead to changes in chromatin state that extend far beyond the critical region, thereby potentially affecting expression globally and thus modifying the patients' phenotype.
Abstract:
A genome-wide screen for large structural variants showed that a copy number variant (CNV) in the region encoding killer cell immunoglobulin-like receptors (KIR) associates with HIV-1 control as measured by plasma viral load at set point in individuals of European ancestry. This CNV encompasses the KIR3DL1-KIR3DS1 locus, encoding receptors that interact with specific HLA-Bw4 molecules to regulate the activation of lymphocyte subsets including natural killer (NK) cells. We quantified the number of copies of KIR3DS1 and KIR3DL1 in a large HIV-1 positive cohort, and showed that an increase in KIR3DS1 count associates with a lower viral set point if its putative ligand is present (p = 0.00028), as does an increase in KIR3DL1 count in the presence of KIR3DS1 and appropriate ligands for both receptors (p = 0.0015). We further provide functional data demonstrating that NK cells from individuals with multiple copies of KIR3DL1, in the presence of KIR3DS1 and the appropriate ligands, inhibit HIV-1 replication more robustly, and that this is associated with a significant expansion in the frequency of KIR3DS1+, but not KIR3DL1+, NK cells in their peripheral blood. Our results suggest that the relative amounts of these activating and inhibitory KIR play a role in regulating the peripheral expansion of highly antiviral KIR3DS1+ NK cells, which may determine differences in HIV-1 control following infection.
Abstract:
IMPORTANCE: The association of copy number variations (CNVs), differing numbers of copies of genetic sequence at locations in the genome, with phenotypes such as intellectual disability has been almost exclusively evaluated using clinically ascertained cohorts. The contribution of these genetic variants to cognitive phenotypes in the general population remains unclear. OBJECTIVE: To investigate the clinical features conferred by CNVs associated with known syndromes in adult carriers without clinical preselection and to assess the genome-wide consequences of rare CNVs (frequency ≤0.05%; size ≥250 kilobase pairs [kb]) on carriers' educational attainment and intellectual disability prevalence in the general population. DESIGN, SETTING, AND PARTICIPANTS: The population biobank of Estonia contains 52,000 participants enrolled from 2002 through 2010. General practitioners examined participants and filled out a questionnaire of health- and lifestyle-related questions, as well as reported diagnoses. Copy number variant analysis was conducted on a random sample of 7877 individuals and genotype-phenotype associations with education and disease traits were evaluated. Our results were replicated on a high-functioning group of 993 Estonians and 3 geographically distinct populations in the United Kingdom, the United States, and Italy. MAIN OUTCOMES AND MEASURES: Phenotypes of genomic disorders in the general population, prevalence of autosomal CNVs, and association of these variants with educational attainment (from less than primary school through scientific degree) and prevalence of intellectual disability. RESULTS: Of the 7877 in the Estonian cohort, we identified 56 carriers of CNVs associated with known syndromes. Their phenotypes, including cognitive and psychiatric problems, epilepsy, neuropathies, obesity, and congenital malformations are similar to those described for carriers of identical rearrangements ascertained in clinical cohorts. 
A genome-wide evaluation of rare autosomal CNVs (frequency, ≤0.05%; ≥250 kb) identified 831 carriers (10.5% of the screened general population). Eleven of 216 (5.1%) carriers of a deletion of at least 250 kb (odds ratio [OR], 3.16; 95% CI, 1.51-5.98; P = 1.5e-03) and 6 of 102 (5.9%) carriers of a duplication of at least 1 Mb (OR, 3.67; 95% CI, 1.29-8.54; P = .008) had an intellectual disability compared with 114 of 6819 (1.7%) in the Estonian cohort. The mean educational attainment was 3.81 (P = 1.06e-04) among 248 (≥250 kb) deletion carriers and 3.69 (P = 5.024e-05) among 115 duplication carriers (≥1 Mb). Of the deletion carriers, 33.5% did not graduate from high school (OR, 1.48; 95% CI, 1.12-1.95; P = .005) and 39.1% of duplication carriers did not graduate from high school (OR, 1.89; 95% CI, 1.27-2.8; P = 1.6e-03). Evidence for an association between rare CNVs and lower educational attainment was supported by analyses of cohorts of adults from Italy and the United States and adolescents from the United Kingdom. CONCLUSIONS AND RELEVANCE: Known pathogenic CNVs in unselected, but assumed to be healthy, adult populations may be associated with unrecognized clinical sequelae. Additionally, individually rare but collectively common intermediate-size CNVs may be negatively associated with educational attainment. Replication of these findings in additional population groups is warranted given the potential implications of this observation for genomics research, clinical care, and public health.
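As a worked illustration of how an odds ratio such as the one reported for deletion carriers follows from the counts, here is a minimal sketch. It assumes the simple 2x2 reading of the abstract's figures (11 of 216 deletion carriers versus 114 of 6819 in the comparison group). The function name `odds_ratio` is hypothetical, and the confidence interval and P value in the abstract come from the authors' own test, which is not reproduced here.

```python
def odds_ratio(exposed_cases, exposed_total, reference_cases, reference_total):
    """Odds ratio from case counts and group totals (a 2x2 table)."""
    exposed_odds = exposed_cases / (exposed_total - exposed_cases)
    reference_odds = reference_cases / (reference_total - reference_cases)
    return exposed_odds / reference_odds


# Counts quoted in the abstract: intellectual disability among carriers of a
# deletion of at least 250 kb, compared with the Estonian comparison group.
deletion_or = odds_ratio(11, 216, 114, 6819)
print(f"{11 / 216:.1%} vs {114 / 6819:.1%}, OR = {deletion_or:.2f}")
# prints: 5.1% vs 1.7%, OR = 3.16
```

The point estimate matches the reported value of 3.16, which suggests the abstract's figure is the unadjusted ratio of odds between the two groups.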
Abstract:
A large fraction of genome variation between individuals is comprised of submicroscopic copy number variation of genomic DNA segments. We assessed the relative contribution of structural changes and gene dosage alterations on phenotypic outcomes with mouse models of Smith-Magenis and Potocki-Lupski syndromes. We phenotyped mice with 1n (Deletion/+), 2n (+/+), 3n (Duplication/+), and balanced 2n compound heterozygous (Deletion/Duplication) copies of the same region. Parallel to the observations made in humans, such variation in gene copy number was sufficient to generate phenotypic consequences: in a number of cases diametrically opposing phenotypes were associated with gain versus loss of gene content. Surprisingly, some neurobehavioral traits were not rescued by restoration of the normal gene copy number. Transcriptome profiling showed that a highly significant propensity of transcriptional changes map to the engineered interval in the five assessed tissues. A statistically significant overrepresentation of the genes mapping to the entire length of the engineered chromosome was also found in the top-ranked differentially expressed genes in the mice containing rearranged chromosomes, regardless of the nature of the rearrangement, an observation robust across different cell lineages of the central nervous system. Our data indicate that a structural change at a given position of the human genome may affect not only locus and adjacent gene expression but also "genome regulation." Furthermore, structural change can cause the same perturbation in particular pathways regardless of gene dosage. Thus, the presence of a genomic structural change, as well as gene dosage imbalance, contributes to the ultimate phenotype.
Resumo:
Arbuscular mycorrhizal fungi (AMF) are ancient asexually reproducing organisms that form symbioses with the majority of plant species, improving plant nutrition and promoting plant diversity. Little is known about the evolution or organization of the genomes of any eukaryotic symbiont or ancient asexual organism. Direct evidence shows that one AMF species is heterokaryotic; that is, containing populations of genetically different nuclei. It has been suggested, however, that the genetic variation passed from generation to generation in AMF is simply due to multiple chromosome sets (that is, high ploidy). Here we show that previously documented genetic variation in Pol-like sequences, which are passed from generation to generation, cannot be due to either high ploidy or repeated gene duplications. Our results provide the clearest evidence so far for substantial genetic differences among nuclei in AMF. We also show that even AMF with a very large nuclear DNA content are haploid. An underlying principle of evolutionary theory is that an individual passes on one or half of its genome to each of its progeny. The coexistence of a population of many genomes in AMF and their transfer to subsequent generations, therefore, has far-reaching consequences for understanding genome evolution.
Resumo:
Abstract: Copy number variation (CNV) of DNA segments has recently gained considerable interest as a source of genetic variation likely to play a role in phenotypic diversity and evolution. Much effort has been put into the identification and mapping of regions that vary in copy number among seemingly normal individuals, both in humans and in a number of model organisms, using both bioinformatic and hybridization-based methods. Synteny studies suggest the existence of CNV hotspots in mammalian genomes, often in connection with regions of segmental duplication. CNV alleles can be in equilibrium within a population, but can also arise de novo between generations, illustrating the highly dynamic nature of these regions. A small number of studies have assessed the effect of CNV on single loci; at the genome-wide scale, however, the functional impact of CNV remains poorly studied. We have explored the influence of CNV on gene expression, first using the Williams-Beuren syndrome (WBS) associated deletion as a model, and second at the genome-wide scale in inbred mouse strains. We found that the WBS deletion influences the expression levels not only of the hemizygous genes but also of the euploid genes mapping nearby. Consistently, on a genome-wide scale we observe that CNV genes are expressed at more variable levels than genes that do not vary in copy number. Likewise, CNVs influence the relative expression levels of genes that map to the flanks of the genome rearrangements, thus globally influencing tissue transcriptomes. Further studies are warranted to complete cataloguing and fine mapping of CNV regions, as well as to elucidate the different mechanisms by which CNVs influence gene expression.
Summary: Copy number variation (CNV) of DNA segments is attracting interest as a form of genetic variation likely to play a role in phenotypic diversity and evolution. Regions that vary in copy number among apparently normal individuals have been mapped and catalogued using DNA microarrays and bioinformatic analysis. The study of synteny among several mammalian species suggests the existence of regions with a high rate of variation, often linked to segmental duplications. CNV alleles can be in equilibrium within a population or can arise de novo. These facts illustrate the highly dynamic nature of these regions. A few studies have examined the effect of copy number variation at isolated loci; however, the impact of this phenomenon has not been studied at the genomic scale. We examined the influence of CNVs on gene expression. We first used the deletion associated with Williams-Beuren syndrome (WBS) and then extended our study to the genome scale in inbred mouse strains. We established that the WBS deletion influences the expression not only of the hemizygous genes but also of the neighbouring euploid genes. At the genomic scale, we observe concordant phenomena: the expression of genes that vary in copy number is more variable than that of genes that do not. Moreover, like the WBS deletion, CNVs influence the expression of adjacent genes, thereby exerting a global impact on expression profiles in tissues.
Summary for a general audience: Many diseases are caused by a genetic defect. Among the types of mutation are the loss (deletion) of part of our genome or its duplication. Although the anomalies associated with certain diseases are known, the molecular mechanisms by which these rearrangements of our genetic material cause disease are still poorly understood. This is why we studied the regulation of genes in regions prone to deletion or duplication. In this work, we demonstrated that deletions and duplications influence the regulation of nearby genes, and that these changes occur in several organs.
Resumo:
To develop a comprehensive overview of copy number aberrations (CNAs) in stage-II/III colorectal cancer (CRC), we characterized 302 tumors from the PETACC-3 clinical trial. Microsatellite-stable (MSS) samples (n = 269) had 66 minimal common CNA regions, with frequent gains on 20q (72.5%), 7 (41.8%), 8q (33.1%) and 13q (51.0%) and losses on 18 (58.6%), 4q (26%) and 21q (21.6%). MSS tumors have significantly more CNAs than microsatellite-instable (MSI) tumors; within the MSI tumors, a novel deletion of the tumor suppressor WWOX at 16q23.1 was identified (p < 0.01). Focal aberrations identified by the GISTIC method confirmed amplifications of oncogenes including EGFR, ERBB2, CCND1, MET, and MYC, and deletions of tumor suppressors including TP53, APC, and SMAD4; gene expression was highly concordant with copy number aberration for these genes. Novel amplicons included putative oncogenes such as WNK1 and HNF4A, which also showed high concordance between copy number and expression. Survival analysis associated a patient subgroup characterized by chromosome 20q gains with improved overall survival, which might be due to higher expression of genes such as EEF1B2 and PTK6. The CNA clustering also grouped tumors characterized by a poor-prognosis BRAF-mutant-like signature derived from mRNA data from this cohort. We further revealed non-random correlation between CNAs among unlinked loci, including positive correlation between 20q gain and 8q gain, and between 20q gain and chromosome 18 loss, consistent with co-selection of these CNAs. These results reinforce the non-random nature of somatic CNAs in stage-II/III CRC and highlight loci and genes that may play an important role in driving the development and outcome of this disease.
Resumo:
Gene copy number polymorphism was studied in a population of the arbuscular mycorrhizal fungus Glomus intraradices by using a quantitative PCR approach on four different genomic regions. Variation in gene copy number was found for a pseudogene and for three ribosomal genes, providing conclusive evidence for a widespread occurrence of macromutational events in the population.