846 resultados para Whole genome sequencing
Resumo:
Forward genetic screens have identified numerous genes involved in development and metabolism, and remain a cornerstone of biological research. However, to locate a causal mutation, the practice of crossing to a polymorphic background to generate a mapping population can be problematic if the mutant phenotype is difficult to recognize in the hybrid F2 progeny, or dependent on parental specific traits. Here in a screen for leaf hyponasty mutants, we have performed a single backcross of an Ethane Methyl Sulphonate (EMS) generated hyponastic mutant to its parent. Whole genome deep sequencing of a bulked homozygous F2 population and analysis via the Next Generation EMS mutation mapping pipeline (NGM) unambiguously determined the causal mutation to be a single nucleotide polymorphisim (SNP) residing in HASTY, a previously characterized gene involved in microRNA biogenesis. We have evaluated the feasibility of this backcross approach using three additional SNP mapping pipelines; SHOREmap, the GATK pipeline, and the samtools pipeline. Although there was variance in the identification of EMS SNPs, all returned the same outcome in clearly identifying the causal mutation in HASTY. The simplicity of performing a single parental backcross and genome sequencing a small pool of segregating mutants has great promise for identifying mutations that may be difficult to map using conventional approaches.
Resumo:
Sorghum is a food and feed cereal crop adapted to heat and drought and a staple for 500 million of the world’s poorest people. Its small diploid genome and phenotypic diversity make it an ideal C4 grass model as a complement to C3 rice. Here we present high coverage (16–45 × ) resequenced genomes of 44 sorghum lines representing the primary gene pool and spanning dimensions of geographic origin, end-use and taxonomic group. We also report the first resequenced genome of S. propinquum, identifying 8 M high-quality SNPs, 1.9 M indels and specific gene loss and gain events in S. bicolor. We observe strong racial structure and a complex domestication history involving at least two distinct domestication events. These assembled genomes enable the leveraging of existing cereal functional genomics data against the novel diversity available in sorghum, providing an unmatched resource for the genetic improvement of sorghum and other grass species.
Resumo:
Sorghum is a food and feed cereal crop adapted to heat and drought and a staple for 500 million of the world’s poorest people. Its small diploid genome and phenotypic diversity make it an ideal C4 grass model as a complement to C3 rice. Here we present high coverage (16-45 × ) resequenced genomes of 44 sorghum lines representing the primary gene pool and spanning dimensions of geographic origin, end-use and taxonomic group. We also report the first resequenced genome of S. propinquum, identifying 8 M high-quality SNPs, 1.9 M indels and specific gene loss and gain events in S. bicolor. We observe strong racial structure and a complex domestication history involving at least two distinct domestication events. These assembled genomes enable the leveraging of existing cereal functional genomics data against the novel diversity available in sorghum, providing an unmatched resource for the genetic improvement of sorghum and other grass species.
Resumo:
The extent to which low-frequency (minor allele frequency (MAF) between 1-5%) and rare (MAF = 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (ntotal = 53,236) and fracture (ntotal = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., Pmeta = 2 x 10(-14)), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 x 10(-11); ncases = 98,742 and ncontrols = 409,511). Using an En1(cre/flox) mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., Pmeta = 1 x 10(-11)). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.
Resumo:
Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City.
Resumo:
Whole genome sequencing (WGS) technology holds great promise as a tool for the forensic epidemiology of bacterial pathogens. It is likely to be particularly useful for studying the transmission dynamics of an observed epidemic involving a largely unsampled 'reservoir' host, as for bovine tuberculosis (bTB) in British and Irish cattle and badgers. BTB is caused by Mycobacterium bovis, a member of the M. tuberculosis complex that also includes the aetiological agent for human TB. In this study, we identified a spatio-temporally linked group of 26 cattle and 4 badgers infected with the same Variable Number Tandem Repeat (VNTR) type of M. bovis. Single-nucleotide polymorphisms (SNPs) between sequences identified differences that were consistent with bacterial lineages being persistent on or near farms for several years, despite multiple clear whole herd tests in the interim. Comparing WGS data to mathematical models showed good correlations between genetic divergence and spatial distance, but poor correspondence to the network of cattle movements or within-herd contacts. Badger isolates showed between zero and four SNP differences from the nearest cattle isolate, providing evidence for recent transmissions between the two hosts. This is the first direct genetic evidence of M. bovis persistence on farms over multiple outbreaks with a continued, ongoing interaction with local badgers. However, despite unprecedented resolution, directionality of transmission cannot be inferred at this stage. Despite the often notoriously long timescales between time of infection and time of sampling for TB, our results suggest that WGS data alone can provide insights into TB epidemiology even where detailed contact data are not available, and that more extensive sampling and analysis will allow for quantification of the extent and direction of transmission between cattle and badgers. © 2012 Biek et al.
Resumo:
Mycobacterium bovis is the causal agent of bovine tuberculosis, one of the most important diseases currently facing the UK cattle industry. Here, we use high-density whole genome sequencing (WGS) in a defined sub-population of M. bovis in 145 cattle across 66 herd breakdowns to gain insights into local spread and persistence. We show that despite low divergence among isolates, WGS can in principle expose contributions of under-sampled host populations to M. bovis transmission. However, we demonstrate that in our data such a signal is due to molecular type switching, which had been previously undocumented for M. bovis. Isolates from farms with a known history of direct cattle movement between them did not show a statistical signal of higher genetic similarity. Despite an overall signal of genetic isolation by distance, genetic distances also showed no apparent relationship with spatial distance among affected farms over distances <5 km. Using simulations, we find that even over the brief evolutionary timescale covered by our data, Bayesian phylogeographic approaches are feasible. Applying such approaches showed that M. bovis dispersal in this system is heterogeneous but slow overall, averaging 2 km/year. These results confirm that widespread application of WGS to M. bovis will bring novel and important insights into the dynamics of M. bovis spread and persistence, but that the current questions most pertinent to control will be best addressed using approaches that more directly integrate WGS with additional epidemiological data.
Resumo:
Whole-genome sequencing (WGS) could potentially provide a single platform for extracting all the information required to predict an organism’s phenotype. However, its ability to provide accurate predictions has not yet been demonstrated in large independent studies of specific organisms. In this study, we aimed to develop a genotypic prediction method for antimicrobial susceptibilities. The whole genomes of 501 unrelated Staphylococcus aureus isolates were sequenced, and the assembled genomes were interrogated using BLASTn for a panel of known resistance determinants (chromosomal mutations and genes carried on plasmids). Results were compared with phenotypic susceptibility testing for 12 commonly used antimicrobial agents (penicillin, methicillin, erythromycin, clindamycin, tetracycline, ciprofloxacin, vancomycin, trimethoprim, gentamicin, fusidic acid, rifampin, and mupirocin) performed by the routine clinical laboratory. We investigated discrepancies by repeat susceptibility testing and manual inspection of the sequences and used this information to optimize the resistance determinant panel and BLASTn algorithm. We then tested performance of the optimized tool in an independent validation set of 491 unrelated isolates, with phenotypic results obtained in duplicate by automated broth dilution (BD Phoenix) and disc diffusion. In the validation set, the overall sensitivity and specificity of the genomic prediction method were 0.97 (95% confidence interval [95% CI], 0.95 to 0.98) and 0.99 (95% CI, 0.99 to 1), respectively, compared to standard susceptibility testing methods. The very major error rate was 0.5%, and the major error rate was 0.7%. WGS was as sensitive and specific as routine antimicrobial susceptibility testing methods. WGS is a promising alternative to culture methods for resistance prediction in S. aureus and ultimately other major bacterial pathogens.
Resumo:
Background: Malaria caused by Plasmodium vivax is an experimentally neglected severe disease with a substantial burden on human health. Because of technical limitations, little is known about the biology of this important human pathogen. Whole genome analysis methods on patient-derived material are thus likely to have a substantial impact on our understanding of P. vivax pathogenesis and epidemiology. For example, it will allow study of the evolution and population biology of the parasite, allow parasite transmission patterns to be characterized, and may facilitate the identification of new drug resistance genes. Because parasitemias are typically low and the parasite cannot be readily cultured, on-site leukocyte depletion of blood samples is typically needed to remove human DNA that may be 1000X more abundant than parasite DNA. These features have precluded the analysis of archived blood samples and require the presence of laboratories in close proximity to the collection of field samples for optimal pre-cryopreservation sample preparation. Results: Here we show that in-solution hybridization capture can be used to extract P. vivax DNA from human contaminating DNA in the laboratory without the need for on-site leukocyte filtration. Using a whole genome capture method, we were able to enrich P. vivax DNA from bulk genomic DNA from less than 0.5% to a median of 55% (range 20%-80%). This level of enrichment allows for efficient analysis of the samples by whole genome sequencing and does not introduce any gross biases into the data. With this method, we obtained greater than 5X coverage across 93% of the P. vivax genome for four P. vivax strains from Iquitos, Peru, which is similar to our results using leukocyte filtration (greater than 5X coverage across 96% of the genome). Conclusion: The whole genome capture technique will enable more efficient whole genome analysis of P. vivax from a larger geographic region and from valuable archived sample collections.
Resumo:
The Saccharomyces cerevisiae strains widely used for industrial fuel-ethanol production have been developed by selection, but their underlying beneficial genetic polymorphisms remain unknown. Here, we report the draft whole-genome sequence of the S. cerevisiae strain CAT-1, which is a dominant fuel-ethanol fermentative strain from the sugarcane industry in Brazil. Our results indicate that strain CAT-1 is a highly heterozygous diploid yeast strain, and the similar to 12-Mb genome of CAT-1, when compared with the reference S228c genome, contains similar to 36,000 homozygous and similar to 30,000 heterozygous single nucleotide polymorphisms, exhibiting an uneven distribution among chromosomes due to large genomic regions of loss of heterozygosity (LOH). In total, 58 % of the 6,652 predicted protein-coding genes of the CAT-1 genome constitute different alleles when compared with the genes present in the reference S288c genome. The CAT-1 genome contains a reduced number of transposable elements, as well as several gene deletions and duplications, especially at telomeric regions, some correlated with several of the physiological characteristics of this industrial fuel-ethanol strain. Phylogenetic analyses revealed that some genes were likely associated with traits important for bioethanol production. Identifying and characterizing the allelic variations controlling traits relevant to industrial fermentation should provide the basis for a forward genetics approach for developing better fermenting yeast strains.
Resumo:
BACKGROUND Whole genome sequencing (WGS) is increasingly used in molecular-epidemiological investigations of bacterial pathogens, despite cost- and time-intensive analyses. We combined strain-specific single nucleotide polymorphism (SNP)-typing and targeted WGS to investigate a tuberculosis cluster spanning 21 years in Bern, Switzerland. METHODS Based on genome sequences of three historical outbreak Mycobacterium tuberculosis isolates, we developed a strain-specific SNP-typing assay to identify further cases. We screened 1,642 patient isolates, and performed WGS on all identified cluster isolates. We extracted SNPs to construct genomic networks. Clinical and social data were retrospectively collected. RESULTS We identified 68 patients associated with the outbreak strain. Most were diagnosed in 1991-1995, but cases were observed until 2011. Two thirds belonged to the homeless and substance abuser milieu. Targeted WGS revealed 133 variable SNP positions among outbreak isolates. Genomic network analyses suggested a single origin of the outbreak, with subsequent division into three sub-clusters. Isolates from patients with confirmed epidemiological links differed by 0-11 SNPs. CONCLUSIONS Strain-specific SNP-genotyping allowed rapid and inexpensive identification of M. tuberculosis outbreak isolates in a population-based strain collection. Subsequent targeted WGS provided detailed insights into transmission dynamics. This combined approach could be applied to track bacterial pathogens in real-time and at high resolution.
Resumo:
Over 250 Mendelian traits and disorders, caused by rare alleles have been mapped in the canine genome. Although each disease is rare in the dog as a species, they are collectively common and have major impact on canine health. With SNP-based genotyping arrays, genome-wide association studies (GWAS) have proven to be a powerful method to map the genomic region of interest when 10-20 cases and 10-20 controls are available. However, to identify the genetic variant in associated regions, fine-mapping and targeted re-sequencing is required. Here we present a new approach using whole-genome sequencing (WGS) of a family trio without prior GWAS. As a proof-of-concept, we chose an autosomal recessive disease known as hereditary footpad hyperkeratosis (HFH) in Kromfohrl änder dogs. To our knowledge, this is the first time this family trio WGS-approach, has successfully been used to identify a genetic variant that perfectly segregates with a canine disorder. The sequencing of three Kromfohrl änder dogs from a family trio (an affected offspring and both its healthy parents) resulted in an average genome coverage of 9.2X per individual. After applying stringent filtering criteria for candidate causative coding variants, 527 single nucleotide variants (SNVs) and 15 indels were found to be homozygous in the affected offspring and heterozygous in the parents. Using the computer software packages ANNOVAR and SIFT to functionally annotate coding sequence differences and to predict their functional effect, resulted in seven candidate variants located in six different genes. Of these, only FAM83G:c155G>C (p.R52P) was found to be concordant in eight additional cases and 16 healthy Kromfohrl änder dogs.