Biblioteca Digital

959 results for GENOME SEQUENCING

EasyStrata: evaluation and visualization of stratified genome-wide association meta-analysis data.

Relevance:

20.00%

Publisher:

Abstract:

The R package EasyStrata facilitates the evaluation and visualization of stratified genome-wide association meta-analysis (GWAMA) results. It provides (i) statistical methods to test and account for between-strata differences as a means to tackle gene-strata interaction effects and (ii) extended graphical features tailored for stratified GWAMA results. The software also provides features suitable for general GWAMAs, including functions to annotate, exclude or highlight specific loci in plots or to extract independent subsets of loci from genome-wide datasets. It is freely available and includes a user-friendly scripting interface that simplifies data handling and allows statistical and graphical functions to be combined flexibly. AVAILABILITY: EasyStrata is available for free (under the GNU General Public License v3) from our Web site www.genepi-regensburg.de/easystrata and from the CRAN R package repository cran.r-project.org/web/packages/EasyStrata/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
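As an illustration of what a between-strata difference test can look like, here is a minimal Python sketch of a Z test comparing two stratum-specific effect estimates. It is not EasyStrata code and does not use its interface; the function name `strata_difference_test` and the correlation parameter `r` are assumptions made for this example.

```python
import math

def strata_difference_test(beta1, se1, beta2, se2, r=0.0):
    """Z statistic and two-sided p-value for the difference between two
    stratum-specific effect estimates (hypothetical helper, not EasyStrata).

    r is the assumed correlation between the two estimates
    (0.0 when the strata are independent samples).
    """
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2
    z = (beta1 - beta2) / math.sqrt(var_diff)
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Example: effect estimates for one variant in two strata (made-up numbers).
z, p = strata_difference_test(0.30, 0.03, 0.00, 0.04)
print(f"z = {z:.2f}, p = {p:.2e}")
```

In a genome-wide setting the same test would be applied to every variant, with the usual multiple-testing correction on the resulting p-values.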

Molecular genetics of narcolepsy

Abstract:

1 Abstract Sleep is a vital necessity, yet its basic physiological function is still unknown, despite numerous studies both in healthy humans and animal models. The study of patients with sleep disorders may help uncover major biological pathways in sleep regulation and thus shed light on the actual function of sleep. Narcolepsy is a well-defined but rare sleep disorder characterized by excessive daytime sleepiness and cataplexy, thought to be caused by a combination of genetic and environmental factors. The aim of this work was to identify genes or genetic variants that contribute to the pathogenesis of sporadic and familial narcolepsy. Sporadic narcolepsy is the disorder with the strongest human leukocyte antigen (HLA) association ever reported. Since the associated HLA-DRB1*1501-DQB1*0602 haplotype is common in the general population (15-25%), it has been suggested that it is necessary but not sufficient for developing narcolepsy. To further define the genetic basis of narcolepsy risk, we performed a genome-wide association study (GWAS) in 562 European individuals with narcolepsy (cases) and 702 ethnically matched controls, with independent replication in 370 cases and 495 controls, all heterozygous for DRB1*1501-DQB1*0602. We found association with a protective variant near HLA-DQA2. Further analysis revealed that the identified SNP is strongly linked to DRB1*03-DQB1*02 and DRB1*1301-DQB1*0603. Cases almost never carried a trans DRB1*1301-DQB1*0603 haplotype. This unexpected protective HLA haplotype suggests a causal involvement of the HLA region in narcolepsy susceptibility. Familial cases of narcolepsy account for 10% of all narcolepsy cases. However, due to the low number of affected family members, narcolepsy families are usually not eligible for genetic linkage studies. We identified and characterized a large Spanish family with 11 affected family members, the largest narcolepsy family ever reported.
We ran a genetic linkage analysis using DNA of 11 affected and 15 unaffected family members and thereby identified a candidate region on chromosome 6 encompassing 163 kb with a maximum multipoint LOD score of 5.02. The coding sequences of 4 genes within this haplotype block as well as 2 neighboring genes were screened for pathogenic mutations in 2 affected and 1 healthy family member. So far no pathogenic mutation could be identified. Further in-depth sequencing of our candidate region as well as whole-exome sequencing are underway to identify the pathogenic mutation(s) in this family and will further improve our understanding of the genetic basis of narcolepsy. 2 Summary Sleep is a vital process whose physiological function is still unknown, despite numerous studies in healthy human subjects as well as in animal models. Studying patients with sleep disorders may uncover biological pathways that play a major role in sleep regulation. One of these disorders, narcolepsy, is a rare but well-defined disease characterized by excessive daytime sleepiness accompanied by cataplexy. Current knowledge suggests that it is caused by a combination of genetic and environmental factors. The aim of this work was to identify the gene(s) or polymorphisms that constitute risk factors in the sporadic and familial forms of narcolepsy. Sporadic narcolepsy is the disease with the strongest association with the human major histocompatibility complex (HLA) ever reported. The frequency of the associated HLA-DRB1*1501-DQB1*0602 haplotype in the general population (15-25%) suggests that it is necessary, but not sufficient, for the disease to develop. We sought to investigate further the genetic factors that increase the risk of narcolepsy.
To this end, we undertook a genome-wide association study (GWAS) in 562 European individuals with narcolepsy (cases) and 702 control individuals of the same ethnic origin, and found an association with a protective variant near the HLA-DQA2 gene. This result was independently replicated in 370 cases and 495 controls, all heterozygous at the DRB1*1501-DQB1*0602 locus. Finer analysis shows that the identified polymorphism is strongly linked to the DRB1*03-DQB1*02 and DRB1*1301-DQB1*0603 alleles. We note that only one case carried a DRB1*1301-DQB1*0603 haplotype in trans. The discovery of this protective HLA allele suggests that the HLA region plays a causal role in susceptibility to narcolepsy. Ten percent of narcolepsy cases are familial. However, the small number of affected members makes these families ineligible for genetic linkage studies. We identified and characterized a large Spanish family in which 11 members are affected, the largest narcolepsy family reported to date. Using DNA from 11 affected and 15 unaffected members, we identified by linkage analysis a candidate region of 163 kilobases (kb) on chromosome 6, corresponding to a multipoint LOD score of 5.02. We searched, without success, for pathogenic mutations in the coding sequence of two genes located within this segment, as well as 4 adjacent genes. Deeper sequencing of the region and whole-exome sequencing are underway and should prove more fruitful, revealing the pathogenic mutation(s) in this family and thereby contributing to a better understanding of the genetic causes of narcolepsy.
3 Summary for a general audience Sleep is a vital necessity whose exact physiological role remains unknown despite numerous studies in healthy human subjects and in animal models. This is why sleep disorders interest researchers: elucidating the mechanisms responsible can help us better understand how normal sleep works. Narcolepsy is a sleep disorder characterized by excessive daytime sleepiness. Affected people may fall asleep involuntarily at any time of day, and also suffer losses of muscle tone (cataplexy) during strong emotions, for example a fit of laughter. Narcolepsy is a rare disease, occurring in 1 person in 2000. Current knowledge suggests that it is caused by a combination of genetic and environmental factors. We set out to identify the genetic factors influencing the onset of the disease, first in its sporadic form, then in a family with many affected members. By comparing the genetic variations of nearly 1000 European narcoleptic subjects with those of 1200 healthy individuals, we found in 30% of the latter a protective variant that reduces the risk of developing the disease 50-fold, which makes it the most powerful protective genetic factor described to date. We then studied a large Spanish family of about thirty members, 11 of whom have narcolepsy. Again, we compared the genetic variations of the affected members with those of the healthy members. We were thus able to identify a region of the genome where the gene(s) involved in the disease in this family would be located, but have not yet found the exact variant(s). A more in-depth study should make it possible to identify it (them) and thus contribute to elucidating the mechanisms leading to the development of narcolepsy.
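For readers unfamiliar with the LOD score quoted in this abstract, the following Python sketch computes a two-point LOD score, the base-10 logarithm of the likelihood ratio between linkage at recombination fraction theta and no linkage (theta = 0.5). It assumes phase-known, fully informative meioses, which is a strong simplification; the multipoint analysis reported in the thesis is more involved, and the function names and the counts in the example are hypothetical.

```python
import math

def two_point_lod(recombinants, meioses, theta):
    """LOD score for a given recombination fraction theta (0 < theta <= 0.5),
    assuming phase-known, fully informative meioses (illustrative assumption)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null

def max_lod(recombinants, meioses, steps=500):
    """Grid search over theta; returns (best LOD, theta at which it occurs)."""
    thetas = [0.5 * k / steps for k in range(1, steps + 1)]
    return max((two_point_lod(recombinants, meioses, t), t) for t in thetas)

# Made-up counts: 2 recombinants observed in 20 informative meioses.
lod, theta = max_lod(2, 20)
print(f"max LOD = {lod:.2f} at theta = {theta:.3f}")
```

By convention, a LOD score above 3 is usually taken as significant evidence of linkage.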

Genetic algorithm for sequencing in mixed model non-permutation flowshops using constrained buffers

Abstract:

This work presents a Genetic Algorithm (GA) for the problem of sequencing units on a production line. It takes into account the possibility of changing the sequence of parts at stations with access to an intermediate or centralized buffer. Access to the buffer is further restricted by the size of the parts. Abstract: This paper presents a Genetic Algorithm (GA) for the problem of sequencing in a mixed model non-permutation flowshop. Resequencing is permitted where stations have access to intermittent or centralized resequencing buffers. The access to a buffer is restricted by the number of available buffer places and the physical size of the products.
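To make the general idea concrete, here is a minimal Python sketch of a genetic algorithm for job sequencing. It deliberately solves a simpler problem than the paper: a plain permutation flowshop with makespan as the objective, with no resequencing buffers and no size constraints. The operators (tournament selection, order crossover, swap mutation), the parameter values and all function names are assumptions chosen for illustration, not the authors' design.

```python
import random

def makespan(seq, proc):
    """Completion time of the last job on the last station.
    proc[j][m] is the processing time of job j on station m."""
    finish = [0] * len(proc[0])
    for j in seq:
        for m in range(len(finish)):
            # A job starts when the station is free and the job has left station m-1.
            start = max(finish[m], finish[m - 1] if m > 0 else 0)
            finish[m] = start + proc[j][m]
    return finish[-1]

def order_crossover(p1, p2, rng):
    """Copy a slice from p1, fill the remaining positions in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(p1[a:b + 1])
    fill = iter(g for g in p2 if g not in kept)
    return [g if g is not None else next(fill) for g in child]

def genetic_algorithm(proc, pop_size=30, generations=100, mutation_rate=0.2, seed=0):
    """Requires at least 2 jobs and pop_size >= 3."""
    rng = random.Random(seed)
    n = len(proc)
    cost = lambda s: makespan(s, proc)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: carry the best sequence forward
        while len(new_pop) < pop_size:
            p1 = min(rng.sample(pop, 3), key=cost)  # tournament selection
            p2 = min(rng.sample(pop, 3), key=cost)
            child = order_crossover(p1, p2, rng)
            if rng.random() < mutation_rate:  # swap mutation
                i, j = rng.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            new_pop.append(child)
        pop = new_pop
        best = min(pop, key=cost)
    return best, cost(best)

# Made-up instance: 5 jobs on 3 stations.
proc = [[3, 2, 4], [1, 5, 2], [4, 1, 3], [2, 3, 1], [5, 2, 2]]
print(genetic_algorithm(proc))
```

Extending the sketch toward the non-permutation case would require a chromosome that also encodes resequencing decisions at buffer stations and a feasibility check on buffer capacity and product size.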

The complete genome sequence of the gram-positive bacterium Bacillus subtilis.

Abstract:

Bacillus subtilis is the best-characterized member of the Gram-positive bacteria. Its genome of 4,214,810 base pairs comprises 4,100 protein-coding genes. Of these protein-coding genes, 53% are represented once, while a quarter of the genome corresponds to several gene families that have been greatly expanded by gene duplication, the largest family containing 77 putative ATP-binding transport proteins. In addition, a large proportion of the genetic capacity is devoted to the utilization of a variety of carbon sources, including many plant-derived molecules. The identification of five signal peptidase genes, as well as several genes for components of the secretion apparatus, is important given the capacity of Bacillus strains to secrete large amounts of industrially important enzymes. Many of the genes are involved in the synthesis of secondary metabolites, including antibiotics, that are more typically associated with Streptomyces species. The genome contains at least ten prophages or remnants of prophages, indicating that bacteriophage infection has played an important evolutionary role in horizontal gene transfer, in particular in the propagation of bacterial pathogenesis.

Complete DNA sequence of Kuraishia capsulata illustrates novel genomic features among budding yeasts (Saccharomycotina)

Abstract:

The numerous yeast genome sequences presently available provide a rich source of information for functional as well as evolutionary genomics but unequally cover the large phylogenetic diversity of extant yeasts. We present here the complete sequence of the nuclear genome of the haploid-type strain of Kuraishia capsulata (CBS1993(T)), a nitrate-assimilating Saccharomycetales of uncertain taxonomy, isolated from tunnels of insect larvae underneath coniferous barks and characterized by its copious production of extracellular polysaccharides. The sequence is composed of seven scaffolds, one per chromosome, totaling 11.4 Mb and containing 6,029 protein-coding genes, ~13.5% of which are interrupted by introns. This GC-rich yeast genome (45.7%) appears phylogenetically related to the few other nitrate-assimilating yeasts sequenced so far, Ogataea polymorpha, O. parapolymorpha, and Dekkera bruxellensis, with which it shares a very reduced number of tRNA genes, a novel tRNA sparing strategy, and a common nitrate assimilation cluster, three features specific to this group of yeasts. Centromeres were recognized in GC-poor troughs of each scaffold. The strain bears MAT alpha genes at a single MAT locus and presents a significant degree of conservation with Saccharomyces cerevisiae genes, suggesting that it can perform sexual cycles in nature, although genes involved in meiosis were not all recognized. The complete absence of conservation of synteny between K. capsulata and any other yeast genome described so far, including the three other nitrate-assimilating species, validates the interest of this species for long-range evolutionary genomic studies among Saccharomycotina yeasts.
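The mention of GC-poor troughs can be illustrated with a sliding-window GC scan. The Python sketch below is not the authors' procedure; the window and step sizes, the function names and the toy sequence are assumptions, and a real analysis would read scaffold sequences from a FASTA file and use windows of several kilobases.

```python
def gc_fraction(seq):
    """Fraction of G/C among unambiguous bases (A, C, G, T); 0.0 if none."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0

def gc_windows(seq, window, step):
    """List of (start, GC fraction) for each full window along the sequence."""
    return [(start, gc_fraction(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]

def lowest_gc_window(seq, window, step):
    """The first window with the minimum GC fraction: a candidate trough."""
    return min(gc_windows(seq, window, step), key=lambda item: item[1])

# Toy scaffold: GC-rich flanks around an AT-only stretch.
scaffold = "GC" * 10 + "AT" * 10 + "GC" * 10
print(lowest_gc_window(scaffold, window=10, step=5))
```

A trough found this way is only a candidate; the abstract does not say which additional evidence was used to call centromeres.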

PhylomeDB v4: zooming into the plurality of evolutionary histories of a genome

Abstract:

Phylogenetic trees representing the evolutionary relationships of homologous genes are the entry point for many evolutionary analyses. For instance, the use of a phylogenetic tree can aid in the inference of orthology and paralogy relationships, and in the detection of relevant evolutionary events such as gene family expansions and contractions, horizontal gene transfer, recombination or incomplete lineage sorting. Similarly, given the plurality of evolutionary histories among genes encoded in a given genome, there is a need for the combined analysis of genome-wide collections of phylogenetic trees (phylomes). Here, we introduce a new release of PhylomeDB (http://phylomedb.org), a public repository of phylomes. Currently, PhylomeDB hosts 120 public phylomes, comprising >1.5 million maximum likelihood trees and multiple sequence alignments. In the current release, phylogenetic trees are annotated with taxonomic, protein-domain arrangement, functional and evolutionary information. PhylomeDB is also a major source for phylogeny-based predictions of orthology and paralogy, covering >10 million proteins across 1059 sequenced species. Here we describe newly implemented PhylomeDB features, and discuss a benchmark of the orthology predictions provided by the database, the impact of proteome updates and the use of the phylome approach in the analysis of newly sequenced genomes and transcriptomes.

The genome of the recently domesticated crop plant sugar beet (Beta vulgaris)

Abstract:

Sugar beet (Beta vulgaris ssp. vulgaris) is an important crop of temperate climates which provides nearly 30% of the world's annual sugar production and is a source for bioethanol and animal feed. The species belongs to the order of Caryophyllales, is diploid with 2n = 18 chromosomes, has an estimated genome size of 714-758 megabases and shares an ancient genome triplication with other eudicot plants. Leafy beets have been cultivated since Roman times, but sugar beet is one of the most recently domesticated crops. It arose in the late eighteenth century when lines accumulating sugar in the storage root were selected from crosses made with chard and fodder beet. Here we present a reference genome sequence for sugar beet as the first non-rosid, non-asterid eudicot genome, advancing comparative genomics and phylogenetic reconstructions. The genome sequence comprises 567 megabases, of which 85% could be assigned to chromosomes. The assembly covers a large proportion of the repetitive sequence content that was estimated to be 63%. We predicted 27,421 protein-coding genes supported by transcript data and annotated them on the basis of sequence homology. Phylogenetic analyses provided evidence for the separation of Caryophyllales before the split of asterids and rosids, and revealed lineage-specific gene family expansions and losses. We sequenced spinach (Spinacia oleracea), another Caryophyllales species, and validated features that separate this clade from rosids and asterids. Intraspecific genomic variation was analysed based on the genome sequences of sea beet (Beta vulgaris ssp. maritima; progenitor of all beet crops) and four additional sugar beet accessions. We identified seven million variant positions in the reference genome, and also large regions of low variability, indicating artificial selection.
The sugar beet genome sequence enables the identification of genes affecting agronomically relevant traits, supports molecular breeding and maximizes the plant's potential in energy biotechnology.

Targeted sequence capture and Ultra High Throughput sequencing for gene discovery in inherited diseases

Abstract:

The introduction of next-generation sequencing technologies is set to revolutionize modern medicine. These new tools have already contributed to the discovery of new genes and cellular pathways involved in the pathology of rare or common genetic diseases. However, the enormous quantity of data generated by these systems, together with the complexity of the bioinformatic analyses required, creates a bottleneck in solving the most difficult cases. The objective of this thesis was to identify the genetic causes of two hereditary diseases using these new sequencing techniques, coupled with gene enrichment technologies. In this context, we developed our own pipeline for aligning sequence fragments (reads). Following the identification of genes, we carried out a functional analysis to elucidate their role in the disease. First, we studied and identified mutations involved in a recessive form of retinitis pigmentosa, which is to date the most frequent hereditary retinal degeneration. In particular, we found that missense mutations in the FAM161A gene were the cause of the retinitis pigmentosa previously associated with the RP28 locus. In addition, we demonstrated that this gene has functions at the level of the photoreceptor cilium, adding to the broad spectrum of hereditary retinal ciliopathies. Second, we explored the possibility that a recurrent fever syndrome relatively frequent in pediatrics, called PFAPA (acronym for periodic fever with aphthous stomatitis, pharyngitis and cervical adenitis), could have a genetic origin. As the etiology of this disease is unclear, we attempted to identify the genetic spectrum of PFAPA patients.
As we were unable to uncover a single new mutated gene responsible for the disease in all screened individuals, a more complex genetic model, suggesting the involvement of several genes in the pathology, appears to have been identified in the affected patients. These genes would notably be involved in processes linked to inflammation, which would broaden the impact of these studies to other auto-inflammatory diseases.

Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture.

Abstract:

Approaches exploiting trait distribution extremes may be used to identify loci associated with common traits, but it is unknown whether these loci are generalizable to the broader population. In a genome-wide search for loci associated with the upper versus the lower 5th percentiles of body mass index, height and waist-to-hip ratio, as well as clinical classes of obesity, including up to 263,407 individuals of European ancestry, we identified 4 new loci (IGFBP4, H6PD, RSRC1 and PPP2R2A) influencing height detected in the distribution tails and 7 new loci (HNF4G, RPTOR, GNAT2, MRPS33P4, ADCY9, HS6ST3 and ZZZ3) for clinical classes of obesity. Further, we find a large overlap in genetic structure and the distribution of variants between traits based on extremes and the general population and little etiological heterogeneity between obesity subgroups.

Comparative analysis of the transcriptome across distant species.

Abstract:

The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.

Genome-wide prediction of matrix attachment regions that increase gene expression in mammalian cells.

Abstract:

Gene transfer in eukaryotic cells and organisms suffers from epigenetic effects that result in low or unstable transgene expression and high clonal variability. Use of epigenetic regulators such as matrix attachment regions (MARs) is a promising approach to alleviate such unwanted effects. Dissection of a known MAR allowed the identification of sequence motifs that mediate elevated transgene expression. Bioinformatics analysis implied that these motifs adopt a curved DNA structure that positions nucleosomes and binds specific transcription factors. From these observations, we computed putative MARs from the human genome. Cloning of several predicted MARs indicated that they are much more potent than the previously known element, boosting the expression of recombinant proteins from cultured cells as well as mediating high and sustained expression in mice. Thus we computationally identified potent epigenetic regulators, opening new strategies toward high and stable transgene expression for research, therapeutic production or gene-based therapies.

Origin and genome evolution of polyploid green toads in Central Asia: evidence from microsatellite markers.

Abstract:

Polyploidization, which is expected to trigger major genomic reorganizations, occurs much less commonly in animals than in plants, possibly because of constraints imposed by sex-determination systems. We investigated the origins and consequences of allopolyploidization in Palearctic green toads (Bufo viridis subgroup) from Central Asia, with three ploidy levels and different modes of genome transmission (sexual versus clonal), to (i) establish a topology for the reticulate phylogeny in a species-rich radiation involving several closely related lineages and (ii) explore processes of genomic reorganization that may follow polyploidization. Sibship analyses based on 30 cross-amplifying microsatellite markers substantiated the maternal origins and revealed the paternal origins and relationships of subgenomes in allopolyploids. Analyses of the synteny of linkage groups identified three markers affected by translocation events, which occurred only within the paternally inherited subgenomes of allopolyploid toads and exclusively affected the linkage group that determines sex in several diploid species of the green toad radiation. Recombination rates did not differ between diploid and polyploid toad species, and were overall much reduced in males, independent of linkage group and ploidy levels. Clonally transmitted subgenomes in allotriploid toads provided support for strong genetic drift, presumably resulting from recombination arrest. The Palearctic green toad radiation seems to offer unique opportunities to investigate the consequences of polyploidization and clonal transmission on the dynamics of genomes in vertebrates.

Quadrant/octant sequencing and the role of coherent structures in bed load sediment entrainment

Abstract:

To permit the tracking of turbulent flow structures in an Eulerian frame from single-point measurements, we make use of a generalization of conventional two-dimensional quadrant analysis to three-dimensional octants. We characterize flow structures using the sequences of these octants and show how significance may be attached to particular sequences using statistical null models. We analyze an example experiment and show how a particular dominant flow structure can be identified from the conditional probability of octant sequences. The frequency of this structure corresponds to the dominant peak in the velocity spectra and exerts a high proportion of the total shear stress. We link this structure explicitly to the propensity for sediment entrainment and show that greater insight into sediment entrainment can be obtained by disaggregating those octants that occur within the identified macroturbulence structure from those that do not. Hence, this work goes beyond critiques of Reynolds stress approaches to bed load entrainment that highlight the importance of outward interactions, to identifying and prioritizing the quadrants/octants that define particular flow structures. Key Points: (i) a new method for analysing single-point velocity data is presented; (ii) flow structures are identified by a sequence of flow states (termed octants); (iii) the identified structure exerts high stresses and causes bed-load entrainment.
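The octant idea can be sketched in a few lines of Python: each velocity sample is classified by the signs of its three fluctuating components, and conditional probabilities are then estimated for transitions between successive octants. The octant labeling (0 to 7), the decision to collapse repeated states, and the function names are assumptions of this sketch; the null-model significance testing described in the abstract is not reproduced here.

```python
from collections import Counter

def octant(u, v, w):
    """Label 0..7 from the signs of three velocity fluctuations.
    The numbering is an arbitrary convention chosen for this sketch."""
    return (u >= 0) * 4 + (v >= 0) * 2 + (w >= 0) * 1

def octant_series(us, vs, ws):
    """Subtract each component's mean, then classify every sample."""
    mean = lambda xs: sum(xs) / len(xs)
    mu, mv, mw = mean(us), mean(vs), mean(ws)
    return [octant(u - mu, v - mv, w - mw) for u, v, w in zip(us, vs, ws)]

def transition_probabilities(labels):
    """Estimate P(next octant | current octant) from a non-empty label series,
    after collapsing runs of identical labels into a single state."""
    collapsed = [labels[0]] + [b for a, b in zip(labels, labels[1:]) if a != b]
    pair_counts = Counter(zip(collapsed, collapsed[1:]))
    from_counts = Counter(collapsed[:-1])
    return {pair: n / from_counts[pair[0]] for pair, n in pair_counts.items()}

# Made-up label series: octant 0 is followed by 1 half the time and by 2 half the time.
print(transition_probabilities([0, 0, 1, 1, 0, 2]))
```

Longer sequences (triplets and beyond) can be handled the same way by counting tuples of consecutive states instead of pairs.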

The role of tributary relative timing and sequencing in controlling large floods

Abstract:

Hydrograph convolution is a product of tributary inputs from across the watershed. The time-space distribution of precipitation, the biophysical processes that control the conversion of precipitation to runoff, and channel flow conveyance processes are heterogeneous, and different areas respond to rainfall in different ways. We take a subwatershed approach to this and account for tributary flow magnitude, relative timing, and sequencing. We hypothesize that as the scale of the watershed increases, we may start to see systematic differences in subwatershed hydrological response. We test this hypothesis for a large flood (T >100 years) in a large watershed in northern England. We undertake a sensitivity analysis of the effects of changing subwatershed hydrological response using a hydraulic model. Delaying upstream tributary peak flow timing to make them asynchronous from downstream subwatersheds reduced flood magnitude. However, significant hydrograph adjustment in any one subwatershed was needed for meaningful reductions in stage downstream, although smaller adjustments in multiple tributaries resulted in comparable impacts. For larger hydrograph adjustments, the effect of changing the timing of two tributaries together was lower than the effect of changing each one separately. For smaller adjustments, synergy between two subwatersheds meant the effect of changing them together could be greater than the sum of the parts. Thus, this work shows that while the effects of modifying biophysical catchment properties diminish with scale due to dilution effects, their impact on relative timing of tributaries may, if applied in the right locations, be an important element of flood management.
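The effect of relative timing can be shown with a toy superposition: two tributary hydrographs are summed at a confluence, once with synchronous peaks and once with one peak delayed. This Python sketch ignores channel routing, attenuation and everything else a hydraulic model represents, so it is only an illustration of the timing argument, not of the study's method. The triangular hydrograph shape, the function names and all numbers are assumptions.

```python
def triangular_hydrograph(peak_flow, time_to_peak, duration, n_steps):
    """Flow at integer time steps 0..n_steps-1 for a triangular flood wave
    (requires 0 < time_to_peak < duration)."""
    flows = []
    for t in range(n_steps):
        if t <= time_to_peak:
            q = peak_flow * t / time_to_peak          # rising limb
        elif t < duration:
            q = peak_flow * (duration - t) / (duration - time_to_peak)  # falling limb
        else:
            q = 0.0
        flows.append(q)
    return flows

def delay(flows, lag):
    """Shift a hydrograph later by `lag` time steps, keeping its length."""
    return [0.0] * lag + flows[:len(flows) - lag]

def combined_peak(hydrographs):
    """Peak of the summed flow, assuming simple superposition at the confluence."""
    return max(sum(values) for values in zip(*hydrographs))

# Two identical made-up tributaries: peak 100 at step 5, flow ends at step 15.
a = triangular_hydrograph(100.0, 5, 15, 40)
b = triangular_hydrograph(100.0, 5, 15, 40)
print(combined_peak([a, b]))            # synchronous peaks
print(combined_peak([a, delay(b, 5)]))  # one tributary delayed by 5 steps
```

In this toy case the delayed configuration gives a lower combined peak than the synchronous one, which is the qualitative behavior the abstract reports for asynchronous tributary peaks.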

Global Diversity Lines-A Five-Continent Reference Panel of Sequenced Drosophila melanogaster Strains.

Abstract:

Reference collections of multiple Drosophila lines with accumulating collections of "omics" data have proven especially valuable for the study of population genetics and complex trait genetics. Here we present a description of a resource collection of 84 strains of Drosophila melanogaster whose genome sequences were obtained after 12 generations of full-sib inbreeding. The initial rationale for this resource was to foster development of a systems biology platform for modeling metabolic regulation by the use of natural polymorphisms as perturbations. As reference lines, they are amenable to repeated phenotypic measurements, and already a large collection of metabolic traits has been assayed. Another key feature of these strains is their widespread geographic origin, coming from Beijing, Ithaca, Netherlands, Tasmania, and Zimbabwe. After obtaining 12.5× coverage of paired-end Illumina sequence reads, SNP and indel calls were made with the GATK platform. Thorough quality control was enabled by deep sequencing one line to >100×, and single-nucleotide polymorphisms and indels were validated using ddRAD-sequencing as an orthogonal platform. In addition, a series of preliminary population genetic tests were performed with these single-nucleotide polymorphism data for assessment of data quality. We found 83 segregating inversions among the lines, and as expected these were especially abundant in the African sample. We anticipate that this will make a useful addition to the set of reference D. melanogaster strains, thanks to its geographic structuring and unusually high level of genetic diversity.
