26 results for Vibrio vulnificus, Genome sequencing, Hybrid assembly, Pathogenesis, Virulence factor, Hemolysin, Secretion system

at Duke University





BACKGROUND: The rate of emergence of human pathogens is steadily increasing; most of these novel agents originate in wildlife. Bats, remarkably, are the natural reservoirs of many of the most pathogenic viruses in humans. There are two bat genome projects currently underway, a circumstance that promises to speed the discovery of host factors important in the coevolution of bats with their viruses. These genomes, however, are not yet assembled and one of them will provide only low coverage, making the inference of most genes of immunological interest error-prone. Many more wildlife genome projects are underway and intend to provide only shallow coverage. RESULTS: We have developed a statistical method for the assembly of gene families from partial genomes. The method takes full advantage of the quality scores generated by base-calling software, incorporating them into a complete probabilistic error model, to overcome the limitation inherent in the inference of gene family members from partial sequence information. We validated the method by inferring the human IFNA genes from the genome trace archives, and used it to infer 61 type-I interferon genes, and single type-II interferon genes in the bats Pteropus vampyrus and Myotis lucifugus. We confirmed our inferences by direct cloning and sequencing of IFNA, IFNB, IFND, and IFNK in P. vampyrus, and by demonstrating transcription of some of the inferred genes by known interferon-inducing stimuli. CONCLUSION: The statistical trace assembler described here provides a reliable method for extracting information from the many available and forthcoming partial or shallow genome sequencing projects, thereby facilitating the study of a wider variety of organisms with ecological and biomedical significance to humans than would otherwise be possible.
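The abstract above does not spell out how quality scores enter the error model. As a rough illustration only, the sketch below converts Phred quality scores into error probabilities and combines overlapping base calls at one site into a posterior over the true base. The function names, the uniform prior, and the assumption that a miscall is equally likely to be any of the other three bases are all illustrative choices, not the authors' method.

```python
import math

BASES = "ACGT"


def phred_to_error(q):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def base_posterior(observations):
    """Posterior over the true base at one site.

    `observations` is a list of (called_base, phred_quality) pairs.
    Assumptions (illustrative): uniform prior over A/C/G/T, independent
    reads, and a miscall is equally likely to be any of the other three bases.
    """
    log_like = {b: 0.0 for b in BASES}
    for called, q in observations:
        # Clamp so that Q=0 is treated as uninformative and log() stays finite.
        e = min(max(phred_to_error(q), 1e-10), 0.75)
        for b in BASES:
            log_like[b] += math.log(1 - e) if b == called else math.log(e / 3)
    # Normalise in log space for numerical stability.
    peak = max(log_like.values())
    weights = {b: math.exp(v - peak) for b, v in log_like.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


if __name__ == "__main__":
    # Two confident "A" calls outweigh one low-quality "C" call.
    print(base_posterior([("A", 30), ("A", 20), ("C", 10)]))
```

With shallow coverage, few reads overlap any given site, so weighting each call by its quality rather than counting votes is one plausible way to keep low-quality calls from dominating.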





Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City.





A precise molecular identification of transmitted hepatitis C virus (HCV) genomes could illuminate key aspects of transmission biology, immunopathogenesis and natural history. We used single genome sequencing of 2,922 half or quarter genomes from plasma viral RNA to identify transmitted/founder (T/F) viruses in 17 subjects with acute community-acquired HCV infection. Sequences from 13 of 17 acute subjects, but none of 14 chronic controls, exhibited one or more discrete low diversity viral lineages. Sequences within each lineage generally revealed a star-like phylogeny of mutations that coalesced to unambiguous T/F viral genomes. Numbers of transmitted viruses leading to productive clinical infection were estimated to range from 1 to 37 or more (median = 4). Four acutely infected subjects showed a distinctly different pattern of virus diversity that deviated from a star-like phylogeny. In these cases, empirical analysis and mathematical modeling suggested high multiplicity virus transmission from individuals who themselves were acutely infected or had experienced a virus population bottleneck due to antiviral drug therapy. These results provide new quantitative and qualitative insights into HCV transmission, revealing for the first time virus-host interactions that successful vaccines or treatment interventions will need to overcome. Our findings further suggest a novel experimental strategy for identifying full-length T/F genomes for proteome-wide analyses of HCV biology and adaptation to antiviral drug or immune pressures.





Photographs from the February 1997 Bermuda meeting. Courtesy of Gert-Jan van Ommen.





UNLABELLED: The human fungal pathogen Cryptococcus neoformans is capable of infecting a broad range of hosts, from invertebrates like amoebas and nematodes to standard vertebrate models such as mice and rabbits. Here we have taken advantage of a zebrafish model to investigate host-pathogen interactions of Cryptococcus with the zebrafish innate immune system, which shares a highly conserved framework with that of mammals. Through live-imaging observations and genetic knockdown, we establish that macrophages are the primary immune cells responsible for responding to and containing acute cryptococcal infections. By interrogating survival and cryptococcal burden following infection with a panel of Cryptococcus mutants, we find that virulence factors initially identified as important in causing disease in mice are also necessary for pathogenesis in zebrafish larvae. Live imaging of the cranial blood vessels of infected larvae reveals that C. neoformans is able to penetrate the zebrafish brain following intravenous infection. By studying a C. neoformans FNX1 gene mutant, we find that blood-brain barrier invasion is dependent on a known cryptococcal invasion-promoting pathway previously identified in a murine model of central nervous system invasion. The zebrafish-C. neoformans platform provides a visually and genetically accessible vertebrate model system for cryptococcal pathogenesis with many of the advantages of small invertebrates. This model is well suited for higher-throughput screening of mutants, mechanistic dissection of cryptococcal pathogenesis in live animals, and use in the evaluation of therapeutic agents. IMPORTANCE: Cryptococcus neoformans is an important opportunistic pathogen that is estimated to be responsible for more than 600,000 deaths worldwide annually. Existing mammalian models of cryptococcal pathogenesis are costly, and the analysis of important pathogenic processes such as meningitis is laborious and remains a challenge to visualize. 
Conversely, although invertebrate models of cryptococcal infection allow high-throughput assays, they fail to replicate the anatomical complexity found in vertebrates and, specifically, cryptococcal stages of disease. Here we have utilized larval zebrafish as a platform that overcomes many of these limitations. We demonstrate that the pathogenesis of C. neoformans infection in zebrafish involves factors identical to those in mammalian and invertebrate infections. We then utilize the live-imaging capacity of zebrafish larvae to follow the progression of cryptococcal infection in real time and establish a relevant model of the critical central nervous system infection phase of disease in a nonmammalian model.





Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.





BACKGROUND: Parrots belong to a group of behaviorally advanced vertebrates and have an advanced ability of vocal learning relative to other vocal-learning birds. They can imitate human speech, synchronize their body movements to a rhythmic beat, and understand complex concepts of referential meaning to sounds. However, little is known about the genetics of these traits. Elucidating the genetic bases would require whole genome sequencing and a robust assembly of a parrot genome. FINDINGS: We present a genomic resource for the budgerigar, an Australian Parakeet (Melopsittacus undulatus) -- the most widely studied parrot species in neuroscience and behavior. We present genomic sequence data that includes over 300× raw read coverage from multiple sequencing technologies and chromosome optical maps from a single male animal. The reads and optical maps were used to create three hybrid assemblies representing some of the largest genomic scaffolds to date for a bird; two of which were annotated based on similarities to reference sets of non-redundant human, zebra finch and chicken proteins, and budgerigar transcriptome sequence assemblies. The sequence reads for this project were in part generated and used for both the Assemblathon 2 competition and the first de novo assembly of a giga-scale vertebrate genome utilizing PacBio single-molecule sequencing. CONCLUSIONS: Across several quality metrics, these budgerigar assemblies are comparable to or better than the chicken and zebra finch genome assemblies built from traditional Sanger sequencing reads, and are sufficient to analyze regions that are difficult to sequence and assemble, including those not yet assembled in prior bird genomes, and promoter regions of genes differentially regulated in vocal learning brain regions. This work provides valuable data and material for genome technology development and for investigating the genomics of complex behavioral traits.





BACKGROUND: There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. Three approaches are possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost-effective alternatives are important. RESULTS: Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high-coverage whole-genome sequencing to those identified by high-coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole-genome sequencing were captured using RNA-Seq, this number rose to 81% when concentrating on genes known to be well-expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage. CONCLUSIONS: We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels.
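The comparison described above can be pictured as set arithmetic over variant calls. The sketch below is a minimal illustration that treats whole-genome calls as the reference set and scores RNA-Seq calls against it. The `(chrom, pos, ref, alt)` tuple representation and the function name are assumptions, and it reports precision rather than specificity, because true negatives are not defined by two call sets alone. The toy data are invented for the example and are not the study's results.

```python
def compare_calls(truth, calls):
    """Score a call set against a reference call set.

    Both arguments are sets of (chrom, pos, ref, alt) tuples (an assumed
    representation). Returns (sensitivity, precision).
    """
    true_positives = len(truth & calls)
    sensitivity = true_positives / len(truth) if truth else float("nan")
    precision = true_positives / len(calls) if calls else float("nan")
    return sensitivity, precision


if __name__ == "__main__":
    # Toy data: five reference variants, three RNA-Seq calls, two shared.
    wgs = {
        ("chr1", 100, "A", "G"),
        ("chr1", 250, "C", "T"),
        ("chr2", 40, "G", "A"),
        ("chr2", 900, "T", "C"),
        ("chr3", 15, "A", "C"),
    }
    rnaseq = {
        ("chr1", 100, "A", "G"),
        ("chr2", 40, "G", "A"),
        ("chr4", 7, "G", "T"),  # not in the reference set: a false positive
    }
    print(compare_calls(wgs, rnaseq))
```

Restricting both sets to variants in well-expressed genes before calling `compare_calls` would mirror the stratified comparison the abstract reports.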





BACKGROUND: The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. FINDINGS: The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at a low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000 to 17,000 protein-coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses.
CONCLUSIONS: Here we release full genome assemblies of 38 newly sequenced avian species, link genome assembly downloads for 7 of the remaining 10 species, and provide a guideline of genomic data that has been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here is expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, developmental biology, and other related areas.
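The abstract above groups assemblies by N50 scaffold size. For readers unfamiliar with the metric, the sketch below computes N50 under its usual definition: the length of the scaffold at which the running total of lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name and the toy lengths are illustrative and are not taken from the project's data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are taken from longest to shortest; N50 is the length of the
    scaffold at which the cumulative length first reaches half the total.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths given")


if __name__ == "__main__":
    # Toy assembly: total length 54, so the threshold is 27 (10 + 9 + 8).
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 depends only on the length distribution, a high value indicates contiguity but says nothing about whether the scaffolds are correctly assembled.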





Much of science progresses within the tight boundaries of what is often seen as a "black box". Though familiar to funding agencies, researchers and the academic journals they publish in, it is an entity that outsiders rarely get to peek into. Crowdfunding is a novel means that allows the public to participate in, as well as to support and witness advancements in science. Here we describe our recent crowdfunding efforts to sequence the Azolla genome, a little fern with massive green potential. Crowdfunding is a worthy platform not only for obtaining seed money for exploratory research, but also for engaging directly with the general public as a rewarding form of outreach.





Ferns are one of the few remaining major clades of land plants for which a complete genome sequence is lacking. Knowledge of genome space in ferns will enable broad-scale comparative analyses of land plant genes and genomes, provide insights into genome evolution across green plants, and shed light on genetic and genomic features that characterize ferns, such as their high chromosome numbers and large genome sizes. As part of an initial exploration into fern genome space, we used a whole genome shotgun sequencing approach to obtain low-density coverage (∼0.4X to 2X) for six fern species from the Polypodiales (Ceratopteris, Pteridium, Polypodium, Cystopteris), Cyatheales (Plagiogyria), and Gleicheniales (Dipteris). We explore these data to characterize the proportion of the nuclear genome represented by repetitive sequences (including DNA transposons, retrotransposons, ribosomal DNA, and simple repeats) and protein-coding genes, and to extract chloroplast and mitochondrial genome sequences. Such initial sweeps of fern genomes can provide information useful for selecting a promising candidate fern species for whole genome sequencing. We also describe variation of genomic traits across our sample and highlight some differences and similarities in repeat structure between ferns and seed plants.





Email exchange in 2013 between Kathryn Maxson (Duke) and Kris Wetterstrand (NHGRI), regarding country funding and other data for the HGP sequencing centers. Also includes the email request for such information, from NHGRI to the centers, in 2000, and the aggregate data collected.





The advent of next-generation sequencing, now nearing a decade in age, has enabled, among other capabilities, measurement of genome-wide sequence features at unprecedented scale and resolution.

In this dissertation, I describe work to understand the genetic underpinnings of non-Hodgkin’s lymphoma through exploration of the epigenetics of its cell of origin, initial characterization and interpretation of driver mutations, and finally, a larger-scale, population-level study that incorporates mutation interpretation with clinical outcome.

In the first research chapter, I describe genomic characteristics of lymphomas through the lens of their cells of origin. Just as many other cancers, such as breast cancer or lung cancer, are categorized based on their cell of origin, lymphoma subtypes can be examined through the context of their normal B Cells of origin, Naïve, Germinal Center, and post-Germinal Center. By applying integrative analysis of the epigenetics of normal B Cells of origin through chromatin-immunoprecipitation sequencing, we find that differences in normal B Cell subtypes are reflected in the mutational landscapes of the cancers that arise from them, namely Mantle Cell, Burkitt, and Diffuse Large B-Cell Lymphoma.

In the next research chapter, I describe our first endeavor into understanding the genetic heterogeneity of Diffuse Large B Cell Lymphoma, the most common form of non-Hodgkin’s lymphoma, which affects 100,000 patients in the world. Through whole-genome sequencing of 1 case as well as whole-exome sequencing of 94 cases, we characterize the most recurrent genetic features of DLBCL and lay the groundwork for a larger study.

In the last research chapter, I describe work to characterize and interpret the whole exomes of 1001 cases of DLBCL in the largest single-cancer study to date. This highly-powered study enabled sub-gene, gene-level, and gene-network level understanding of driver mutations within DLBCL. Moreover, matched genomic and clinical data enabled the connection of these driver mutations to clinical features such as treatment response or overall survival. As sequencing costs continue to drop, whole-exome sequencing will become a routine clinical assay, and another diagnostic dimension in addition to existing methods such as histology. However, to unlock the full utility of sequencing data, we must be able to interpret it. This study undertakes a first step in developing the understanding necessary to uncover the genomic signals of DLBCL hidden within its exomes. However, beyond the scope of this one disease, the experimental and analytical methods can be readily applied to other cancer sequencing studies.

Thus, this dissertation leverages next-generation sequencing analysis to understand the genetic underpinnings of lymphoma, both by examining its normal cells of origin as well as through a large-scale study to sensitively identify recurrently mutated genes and their relationship to clinical outcome.





BACKGROUND: Vesiculation is a ubiquitous secretion process of Gram-negative bacteria, where outer membrane vesicles (OMVs) are small spherical particles on the order of 50 to 250 nm composed of outer membrane (OM) and lumenal periplasmic content. Vesicle functions have been elucidated in some detail, showing their importance in virulence factor secretion, bacterial survival, and biofilm formation in pathogenesis. Furthermore, OMVs serve as an envelope stress response, protecting the secreting bacteria from internal protein misfolding stress, as well as external envelope stressors. Despite their important functional roles very little is known about the regulation and mechanism of vesicle production. Based on the envelope architecture and prior characterization of the hypervesiculation phenotypes for mutants lacking the lipoprotein, Lpp, which is involved in the covalent OM-peptidoglycan (PG) crosslinks, it is expected that an inverse relationship exists between OMV production and PG-crosslinked Lpp. RESULTS: In this study, we found that subtle modifications of PG remodeling and crosslinking modulate OMV production, inversely correlating with bound Lpp levels. However, this inverse relationship was not found in strains in which OMV production is driven by an increase in "periplasmic pressure" resulting from the accumulation of protein, PG fragments, or lipopolysaccharide. In addition, the characterization of an nlpA deletion in backgrounds lacking either Lpp- or OmpA-mediated envelope crosslinks demonstrated a novel role for NlpA in envelope architecture. CONCLUSIONS: From this work, we conclude that OMV production can be driven by distinct Lpp concentration-dependent and Lpp concentration-independent pathways.





Enterotoxigenic Escherichia coli (ETEC) is a significant source of morbidity and mortality worldwide. One major virulence factor released by ETEC is the heat-labile enterotoxin LT, which is structurally and functionally similar to cholera toxin. LT consists of five B subunits carrying a single catalytically active A subunit. LTB binds the monosialoganglioside G(M1), the toxin's host receptor, but interactions with A-type blood sugars and E. coli lipopolysaccharide have also been identified within the past decade. Here, we review the regulation, assembly, and binding properties of the LT B-subunit pentamer and discuss the possible roles of its numerous molecular interactions.