846 results for Whole genome sequencing
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
The SOX family of transcription factors is found throughout the animal kingdom and is important in a variety of developmental contexts. Genome analysis has identified 20 Sox genes in human and mouse, which can be subdivided into 8 groups based on sequence comparison and intron-exon structure. Most of the SOX groups identified in mammals are represented by a single SOX sequence in invertebrate model organisms, suggesting that a duplication and divergence mechanism has operated during vertebrate evolution. We have now analysed the Sox gene complement in the pufferfish, Fugu rubripes, in order to shed further light on the diversity and origins of the Sox gene family. Major differences were found between the Sox family in Fugu and those in humans and mice. In particular, Fugu does not have orthologues of Sry, Sox15 and Sox30, which appear to be specific to mammals, while Sox19, found in Fugu and zebrafish but absent in mammals, seems to be specific to fishes. Six mammalian Sox genes are represented by two copies each in Fugu, indicating a large-scale gene duplication in the fish lineage. These findings point to recent Sox gene loss, duplication and divergence occurring during the evolution of tetrapod and teleost lineages, and provide further evidence for a large-scale segmental or whole-genome duplication occurring early in the radiation of teleosts. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
The vacuolar H(+)-ATPase (V-ATPase), a multisubunit, adenosine triphosphate (ATP)-driven proton pump, is essential for numerous cellular processes in all eukaryotes investigated so far. While its structure and catalytic mechanism are similar to those of the evolutionarily related F-type ATPases, the V-ATPase's main function is to establish an electrochemical proton potential across membranes using ATP hydrolysis. The holoenzyme is formed by two subcomplexes, the transmembrane V(0) complex and the cytoplasmic V(1) complex. Sequencing of the whole genome of the ciliate Paramecium tetraurelia enabled the identification of virtually all the genes encoding V-ATPase subunits in this organism, as well as the study of the enzyme's localization and its roles in membrane trafficking and osmoregulation. Surprisingly, the number of V-ATPase genes in this free-living protozoan is strikingly higher than in any other species previously studied. Especially abundant are V(0)-a-subunits, with as many as 17 encoding genes. This abundance creates the possibility of forming a large number of different V-ATPase holoenzymes by combination and has functional consequences through differential targeting to various organelles.
Abstract:
The Arabidopsis root apical meristem (RAM) is a complex tissue capable of generating all the cell types that ultimately make up the root. The work presented in this thesis takes advantage of the versatility of high-throughput sequencing to address two independent questions about the root meristem. Although much is known about the cell fate decisions that occur at the RAM, cortex specification and differentiation remain poorly understood. In the first part of this thesis, I used an ethylmethanesulfonate (EMS) mutagenized marker line to perform a forward genetics screen. The goal of this screen was to identify novel genes involved in the specification and differentiation of the cortex tissue. Mapping analysis of the results obtained in this screen revealed a new allele of BRASSINOSTEROID4 with abnormal marker expression in the cortex tissue. Although this allele proved not to be cortex specific, this project highlights new technology that allows mapping of EMS-generated mutations without the need to map-cross or back-cross. In the second part of this thesis, using fluorescence-activated cell sorting (FACS) coupled with high-throughput sequencing, my collaborators and I generated single-base resolution whole-genome DNA methylomes, mRNA transcriptomes, and small RNA transcriptomes for six different populations of cell types in the Arabidopsis root meristem. We discovered that the columella is hypermethylated in the CHH context within transposable elements. This hypermethylation is accompanied by upregulation of the RNA-directed DNA methylation (RdDM) pathway, including higher levels of 24-nt small interfering RNAs (siRNAs). In summary, our studies demonstrate the versatility of high-throughput sequencing as a method for identifying single mutations and for performing complex comparative genomic analyses.
Abstract:
Email exchange in 2013 between Kathryn Maxson (Duke) and Kris Wetterstrand (NHGRI), regarding country funding and other data for the HGP sequencing centers. Also includes the email request for such information, from NHGRI to the centers, in 2000, and the aggregate data collected.
Abstract:
The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens because of the high-resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements, could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first population-level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements.
Abstract:
Diabetes is the leading cause of end-stage renal disease. Despite evidence for a substantial heritability of diabetic kidney disease, efforts to identify genetic susceptibility variants have had limited success. We extended previous efforts in three dimensions, examining a more comprehensive set of genetic variants in larger numbers of subjects with type 1 diabetes characterized for a wider range of cross-sectional diabetic kidney disease phenotypes. In 2,843 subjects, we estimated that the heritability of diabetic kidney disease was 35% (p = 6×10⁻³). Genome-wide association analysis and replication in 12,540 individuals identified no single variants reaching stringent levels of significance and, despite excellent power, provided little independent confirmation of previously published associated variants. Whole exome sequencing in 997 subjects failed to identify any large-effect coding alleles of lower frequency influencing the risk of diabetic kidney disease. However, sets of alleles increasing body mass index (p = 2.2×10⁻⁵) and the risk of type 2 diabetes (p = 6.1×10⁻⁴) were associated with the risk of diabetic kidney disease. We also found genome-wide genetic correlation between diabetic kidney disease and failure at smoking cessation (p = 1.1×10⁻⁴). Pathway analysis implicated ascorbate and aldarate metabolism (p = 9×10⁻⁶) and pentose and glucuronate interconversions (p = 3×10⁻⁶) in the pathogenesis of diabetic kidney disease. These data provide further evidence for the role of genetic factors influencing diabetic kidney disease in those with type 1 diabetes and highlight some key pathways that may be responsible. Altogether these results reveal important biology behind the major cause of kidney disease.
Abstract:
Genome-wide association studies (GWAS) have identified several risk variants for late-onset Alzheimer's disease (LOAD)1, 2. These common variants have replicable but small effects on LOAD risk and generally do not have obvious functional effects. Low-frequency coding variants, not detected by GWAS, are predicted to include functional variants with larger effects on risk. To identify low-frequency coding variants with large effects on LOAD risk, we carried out whole-exome sequencing (WES) in 14 large LOAD families and follow-up analyses of the candidate variants in several large LOAD case–control data sets. A rare variant in PLD3 (phospholipase D3; Val232Met) segregated with disease status in two independent families and doubled risk for Alzheimer’s disease in seven independent case–control series with a total of more than 11,000 cases and controls of European descent. Gene-based burden analyses in 4,387 cases and controls of European descent and 302 African American cases and controls, with complete sequence data for PLD3, reveal that several variants in this gene increase risk for Alzheimer’s disease in both populations. PLD3 is highly expressed in brain regions that are vulnerable to Alzheimer’s disease pathology, including hippocampus and cortex, and is expressed at significantly lower levels in neurons from Alzheimer’s disease brains compared to control brains. Overexpression of PLD3 leads to a significant decrease in intracellular amyloid-β precursor protein (APP) and extracellular Aβ42 and Aβ40 (the 42- and 40-residue isoforms of the amyloid-β peptide), and knockdown of PLD3 leads to a significant increase in extracellular Aβ42 and Aβ40. Together, our genetic and functional data indicate that carriers of PLD3 coding variants have a twofold increased risk for LOAD and that PLD3 influences APP processing. 
This study provides an example of how densely affected families may help to identify rare variants with large effects on risk for disease or other complex traits.
Abstract:
Eukaryotic genomes contain repetitive DNA sequences, including simple repeats and more complex transposable elements (TEs). Many TEs reach high copy numbers in the host genome through their specific amplification mechanisms. There is growing evidence that TEs contribute to gene transcriptional regulation. However, excess TE activity may reduce genome stability. Therefore, TEs are suppressed by the transcriptional gene silencing machinery via specific chromatin modifications. Conversely, effective epigenetic silencing poses a risk to TE survival in the host genome. TEs may therefore have evolved specific strategies for bypassing epigenetic control and allowing the emergence of new TE copies. Recent studies suggested that epigenetic silencing can be attenuated, at least transiently, by heat stress in A. thaliana. Heat stress induced strong transcriptional activation of COPIA78 family LTR-retrotransposons named ONSEN, and even their transposition in mutants deficient in siRNA biogenesis. ONSEN transcriptional activation was facilitated by the presence of heat responsive elements (HREs) within the long terminal repeats, which serve as a binding platform for the HEAT SHOCK FACTORs (HSFs). This thesis focused on the evolution of ONSEN heat responsiveness in Brassicaceae. Using a whole-transcriptome sequencing approach, multiple Arabidopsis lyrata ONSENs with a conserved heat response were found and, together with ONSENs from other Brassicaceae, were used to reconstruct the evolution of ONSEN HREs. This indicated an ancestral state with two HSF binding motifs organized as a palindrome. In the genera Arabidopsis and Ballantinia, a local duplication of this locus increased the number of HSF binding motifs to four, forming a high-efficiency HRE. In addition, whole-transcriptome analysis revealed the novel heat-responsive TE families COPIA20, COPIA37 and HATE. 
Notably, HATE represents a previously unknown COPIA family that occurs in several Brassicaceae species but is absent from A. thaliana. Putative HREs were identified within the LTRs of COPIA20, COPIA37 and HATE of A. lyrata, and could be preliminarily validated by transcriptional analysis upon heat induction in a subsequent survey of Brassicaceae species. Phylogenetic analysis then indicated a repeated evolution of heat responsiveness within Brassicaceae COPIA LTR-retrotransposons. This suggests that acquisition of heat responsiveness may represent a successful strategy for the survival of TEs within the host genome.
Abstract:
Insights into the genomic adaptive traits of Treponema pallidum, the causative bacterium of syphilis, have long been hampered due to the absence of in vitro culture models and the constraints associated with its propagation in rabbits. Here, we have bypassed the culture bottleneck by means of a targeted strategy never applied to uncultivable bacterial human pathogens to directly capture whole-genome T. pallidum data in the context of human infection. This strategy has unveiled a scenario of discreet T. pallidum interstrain single-nucleotide-polymorphism-based microevolution, contrasting with a rampant within-patient genetic heterogeneity mainly targeting multiple phase-variable loci and a major antigen-coding gene (tprK). TprK demonstrated remarkable variability and redundancy, intra- and interpatient, suggesting ongoing parallel adaptive diversification during human infection. Some bacterial functions (for example, flagella- and chemotaxis-associated) were systematically targeted by both inter- and intrastrain single nucleotide polymorphisms, as well as by ongoing within-patient phase variation events. Finally, patient-derived genomes possess mutations targeting a penicillin-binding protein coding gene (mrcA) that had never been reported, unveiling it as a candidate target to investigate the impact on the susceptibility to penicillin. Our findings decode the major genetic mechanisms by which T. pallidum promotes immune evasion and survival, and demonstrate the exceptional power of characterizing evolving pathogen subpopulations during human infection.
Abstract:
Next-generation sequencing (NGS) makes it possible to sequence the whole genome of an organism, whereas Maxam-Gilbert and Sanger sequencing can barely sequence a single gene. Eliminating the separation of DNA fragments by electrophoresis and developing techniques that allow parallelization (the simultaneous analysis of many DNA fragments) have been crucial to this improvement. The newer companies in this field, Roche and Illumina, adopted different protocols to achieve these goals. Illumina relies on sequencing by synthesis (SBS), which requires library preparation and the use of adapters. Illumina has displaced Roche because of its lower misincorporation rate, which makes it well suited to studies of genetic variability, transcriptomics, epigenomics and metagenomics, the last of which is the focus of this study. However, the most recent advances come from third-generation sequencing, which uses nanotechnology to design small sequencers that sequence the whole genome of an organism quickly and inexpensively. Moreover, these systems provide more reliable data than current ones because they sequence a single molecule, which solves the problem of synchronisation. In this way, PacBio and Nanopore enable great progress in diagnostics and personalized medicine. Metagenomics allows a qualitative and quantitative analysis of the various species present in a sample. The main advantage of this technique is that the species need not be isolated and grown, which allows nonculturable species to be analysed. The Illumina protocol targets the 16S rRNA gene, which contains both variable and conserved regions and thereby provides a phylogenetic classification. Metagenomics is therefore of interest for characterizing the biodiversity of complex ecosystems and for studying the microbiome of patients, given the strong association of certain microbial profiles with certain metabolic diseases.
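As a rough illustration of the quantitative side of such a 16S-based analysis, the sketch below converts per-taxon read counts into relative abundances and a Shannon diversity index. The taxon names, the counts and the function names are hypothetical placeholders, not data or methods from the study; a real pipeline would also include quality filtering and taxonomic assignment, which are not shown.

```python
import math

def relative_abundance(counts):
    """Convert raw per-taxon read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {taxon: n / total for taxon, n in counts.items()}

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with at least one read."""
    fractions = relative_abundance(counts)
    return -sum(p * math.log(p) for p in fractions.values() if p > 0)

# Hypothetical read counts assigned to three genera in one sample.
example_counts = {"Bacteroides": 500, "Lactobacillus": 300, "Escherichia": 200}

if __name__ == "__main__":
    for taxon, fraction in relative_abundance(example_counts).items():
        print(f"{taxon}: {fraction:.1%}")
    print(f"Shannon index: {shannon_index(example_counts):.3f}")
```

Relative abundances address the "how much of each" question, while a diversity index gives a single summary number that can be compared across samples.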
Abstract:
Oomycetes cause extensive losses in agriculture and widespread degradation in natural plant communities, and are responsible for the death of thousands of trees every year. Two representative species are Phytophthora infestans, which causes late blight of potato, and Phytophthora cinnamomi, which causes chestnut ink disease, responsible for losses in sweet chestnut production in Europe. Genome sequencing efforts have focused on three species: P. infestans, P. sojae and P. ramorum. Phytophthora infestans has been developed as the model species for the genus, with excellent genetic and genomic resources including genetic maps, BAC libraries, and EST sequences. Our research team is working to sequence the genome of P. cinnamomi in order to gain a better understanding of this oomycete, to study changes in plant-pathogen relationships, including those resulting from climate change, and to decrease the pathogen’s impact on crops and on plants in natural ecosystems worldwide. We present here a preliminary report of partially sequenced genomic DNA from P. cinnamomi encoding putative protein-coding sequences and tRNAs. Database analysis reveals the presence of genes conserved in oomycetes.
Abstract:
Dengue virus (DENV) infections are a significant public health concern worldwide; DENV is considered the most prevalent arthropod-borne virus in terms of the number of reported cases. In this study, we report the complete genome sequencing of a DENV serotype 4 isolate, genotype II, obtained in the city of Manaus directly from a serum sample using Ion Torrent sequencing technology. The use of a massively parallel sequencing technology allowed the detection of two variable sites within the viral population, one in the coding region for the viral envelope protein and the other in the nonstructural 1 coding region.
Abstract:
Rhizobium freirei PRF 81 is employed in commercial inoculants for common bean in Brazil because of its outstanding efficiency in fixing nitrogen, its competitiveness and its tolerance to abiotic stresses. Among the environmental conditions faced by rhizobia in soils, acidity is perhaps the most frequently encountered, especially in Brazil. We therefore used proteomics-based approaches to study the responses of PRF 81 to a low-pH condition. R. freirei PRF 81 was grown in TY medium until exponential phase under two treatments: pH 6.8 and pH 4.8. Whole-cell proteins were extracted and separated by two-dimensional gel electrophoresis, using IPG strips with a pH range of 4-7 and 12% polyacrylamide gels. The experiment was performed in triplicate. Protein spots were detected in the high-resolution digitized gel images and analyzed with Image Master 2D Platinum v 5.0 software. Relative spot volumes (%vol) were compared between the two conditions tested and statistically evaluated (p ≤ 0.05). Although R. freirei PRF 81 can still grow under more acidic conditions, pH 4.8 was chosen because it did not significantly affect the bacterial growth kinetics, a factor that could compromise the analysis. With a narrow pH range, the gel profiles displayed better resolution and reproducibility than with a broader pH range. Spots were mostly concentrated between pH 5 and 7 and molecular masses between 17 and 95 kDa. Of the six hundred well-defined spots analyzed, one hundred and sixty-three showed a significant change in %vol, indicating that the pH led to substantial changes in the proteome of R. freirei PRF 81. Of these, sixty-one were up-regulated and one hundred and two were down-regulated in the pH 4.8 condition. In addition, fourteen spots were identified only in the acid condition, while seven spots were detected exclusively at pH 6.8. Ninety-five differentially expressed spots and two spots detected exclusively at pH 4.8 were selected for MALDI-TOF identification. 
Together with the genome sequencing and the proteome analysis of heat stress, we will search for molecular determinants of PRF 81 related to its capacity to adapt to stressful tropical conditions.
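The spot counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the variable names are illustrative.

```python
# Spot counts as reported in the abstract.
total_spots = 600
up_regulated = 61
down_regulated = 102
only_acid = 14          # spots seen only at pH 4.8
only_neutral = 7        # spots seen only at pH 6.8
sent_differential = 95  # differentially expressed spots sent for identification
sent_exclusive = 2      # exclusively detected spots sent for identification

# Spots with a significant change in relative volume.
changed = up_regulated + down_regulated
fraction_changed = changed / total_spots
# Spots exclusive to one condition, and spots selected for mass spectrometry.
exclusive = only_acid + only_neutral
selected_for_ms = sent_differential + sent_exclusive

if __name__ == "__main__":
    print(f"significantly changed spots: {changed} ({fraction_changed:.1%} of {total_spots})")
    print(f"condition-exclusive spots: {exclusive}")
    print(f"spots selected for identification: {selected_for_ms}")
```

The sum of up- and down-regulated spots matches the one hundred and sixty-three changed spots stated in the abstract, which is a little over a quarter of the spots analyzed.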
Abstract:
Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections in both GC and BrC. Whole genome sequencing (WGS) technologies make it possible to search for viral agents in tissues of patients with cancer. These technologies have already helped to establish virus-cancer associations and to discover new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline.
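The abstract does not describe how the in silico simulation was set up. As one simple way to reason about sequencing depth, the sketch below assumes that a fixed fraction of all reads derives from a virus and that the number of viral reads follows a Poisson distribution. The model, the function names and the example fraction are assumptions made for illustration, not the authors' method.

```python
import math

def prob_at_least_k_viral_reads(total_reads, viral_fraction, k=1):
    """P(X >= k) for X ~ Poisson(total_reads * viral_fraction)."""
    mean = total_reads * viral_fraction
    below_k = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below_k

def reads_needed_for_detection(viral_fraction, target_prob=0.95):
    """Smallest read count giving at least one viral read with probability target_prob."""
    if not 0 < viral_fraction <= 1 or not 0 < target_prob < 1:
        raise ValueError("viral_fraction must be in (0, 1] and target_prob in (0, 1)")
    return math.ceil(-math.log(1.0 - target_prob) / viral_fraction)

if __name__ == "__main__":
    fraction = 1e-6  # hypothetical: one viral read per million reads
    for depth in (1_000_000, 5_000_000, 20_000_000):
        p = prob_at_least_k_viral_reads(depth, fraction, k=3)
        print(f"{depth:>10,} reads -> P(at least 3 viral reads) = {p:.3f}")
    print("reads for a 95% chance of one viral read:", reads_needed_for_detection(fraction))
```

Under these assumptions, the cost question becomes how many reads are needed before a rare viral sequence is likely to be seen more than once, which is the kind of trade-off a simulation of this sort can explore.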