932 results for Klebsiella pneumoniae genome sequence
Abstract:
The publication of the human genome sequence in 2001 was a major step forward in knowledge necessary to understand the variations between individuals. For farmed species, genomic sequence information will facilitate the selection of animals optimised to live, and be productive, in particular environments. The availability of cattle genome sequence has allowed the breeding industry to take the first steps towards predicting phenotypes from genotypes by estimating a genomic breeding value (gEBV) for bulls using genome-wide DNA markers. The sequencing of the buffalo genome and creation of a panel of DNA markers has created the opportunity to apply molecular selection approaches for this species. The genomes of several buffalo of different breeds were sequenced and aligned with the bovine genome, which facilitated the identification of millions of sequence variants in the buffalo genomes. Based on frequencies of variants within and among buffalo breeds, and their distribution across the genome compared with the bovine genome, 90,000 putative single nucleotide polymorphisms (SNP) were selected to create an Axiom® Buffalo Genotyping Array 90K. This SNP Chip was tested in buffalo populations from Italy and Brazil and found to have at least 75% high quality and polymorphic markers in these populations. The 90K SNP chip was then used to investigate the structure of buffalo populations, and to localise the variations having a major effect on milk production.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
The Saccharomyces cerevisiae strains widely used for industrial fuel-ethanol production have been developed by selection, but their underlying beneficial genetic polymorphisms remain unknown. Here, we report the draft whole-genome sequence of the S. cerevisiae strain CAT-1, which is a dominant fuel-ethanol fermentative strain from the sugarcane industry in Brazil. Our results indicate that strain CAT-1 is a highly heterozygous diploid yeast strain, and the ~12-Mb genome of CAT-1, when compared with the reference S288c genome, contains ~36,000 homozygous and ~30,000 heterozygous single nucleotide polymorphisms, exhibiting an uneven distribution among chromosomes due to large genomic regions of loss of heterozygosity (LOH). In total, 58 % of the 6,652 predicted protein-coding genes of the CAT-1 genome constitute different alleles when compared with the genes present in the reference S288c genome. The CAT-1 genome contains a reduced number of transposable elements, as well as several gene deletions and duplications, especially at telomeric regions, some correlated with several of the physiological characteristics of this industrial fuel-ethanol strain. Phylogenetic analyses revealed that some genes were likely associated with traits important for bioethanol production. Identifying and characterizing the allelic variations controlling traits relevant to industrial fermentation should provide the basis for a forward genetics approach for developing better fermenting yeast strains.
Abstract:
Intron splicing is one of the most important steps involved in the maturation process of a pre-mRNA. Although the sequence profiles around the splice sites have been studied extensively, the levels of sequence identity between the exonic sequences preceding the donor sites and the intronic sequences preceding the acceptor sites have not been examined as thoroughly. In this study we investigated identity patterns between the last 15 nucleotides of the exonic sequence preceding the 5' splice site and the intronic sequence preceding the 3' splice site in a set of human protein-coding genes that do not exhibit intron retention. We found that almost 60% of consecutive exons and introns in human protein-coding genes share at least two identical nucleotides at their 3' ends and, on average, the sequence identity length is 2.47 nucleotides. Based on our findings we conclude that the 3' ends of exons and introns tend to have longer identical sequences within a gene than when taken from different genes. Our results hold even if the pairs are non-consecutive in the transcription order. © 2012 Elsevier Ltd. All rights reserved.
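The identity measure described in the abstract above (the number of identical nucleotides shared at the 3' ends of an exon and an intron) can be read as a common-suffix length. The following is a minimal Python sketch under that assumption; the function name and the two 15-nucleotide example sequences are hypothetical and are not taken from the study.

```python
def shared_3prime_identity(exon_tail: str, intron_tail: str) -> int:
    """Return the length of the identical run at the 3' ends of two sequences."""
    length = 0
    # Walk both sequences from their last nucleotide backwards.
    for a, b in zip(reversed(exon_tail.upper()), reversed(intron_tail.upper())):
        if a != b:
            break
        length += 1
    return length


# Hypothetical 15-nt tails, for illustration only.
exon_tail = "GATTCCGATGACCAG"    # last 15 nt of an exon (before the 5' splice site)
intron_tail = "TTTCTCTTTTCCCAG"  # last 15 nt of an intron (before the 3' splice site)
identity = shared_3prime_identity(exon_tail, intron_tail)
print(identity)
```

Averaging this value over many exon/intron pairs would give a figure comparable in kind to the mean identity length reported in the abstract, although the authors' exact procedure is not stated here.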
Abstract:
Background: Citrus canker is a disease that has severe economic impact on the citrus industry worldwide. There are three types of canker, called A, B, and C. The three types have different phenotypes and affect different citrus species. The causative agent for type A is Xanthomonas citri subsp. citri, whose genome sequence was made available in 2002. Xanthomonas fuscans subsp. aurantifolii strain B causes canker B and Xanthomonas fuscans subsp. aurantifolii strain C causes canker C. Results: We have sequenced the genomes of strains B and C to draft status. We have compared their genomic content to X. citri subsp. citri and to other Xanthomonas genomes, with special emphasis on type III secreted effector repertoires. In addition to pthA, already known to be present in all three citrus canker strains, two additional effector genes, xopE3 and xopAI, are also present in all three strains and are both located on the same putative genomic island. These two effector genes, along with one other effector-like gene in the same region, are thus good candidates for being pathogenicity factors on citrus. Numerous gene content differences also exist between the three canker strains, which can be correlated with their different virulence and host range. Particular attention was placed on the analysis of genes involved in biofilm formation and quorum sensing, type IV secretion, flagellum synthesis and motility, lipopolysaccharide synthesis, and on the gene xacPNP, which codes for a natriuretic protein. Conclusion: We have uncovered numerous commonalities and differences in gene content between the genomes of the pathogenic agents causing citrus canker A, B, and C and other Xanthomonas genomes. Molecular genetics can now be employed to determine the role of these genes in plant-microbe interactions. The gained knowledge will be instrumental for improving citrus canker control.
Abstract:
Background: Even before having its genome sequence published in 2004, Kluyveromyces lactis had long been considered a model organism for studies in genetics and physiology. Research on Kluyveromyces lactis is quite advanced and this yeast species is one of the few with which it is possible to perform formal genetic analysis. Nevertheless, until now, no complete metabolic functional annotation has been performed for the proteins encoded in the Kluyveromyces lactis genome. Results: In this work, a new metabolic genome-wide functional re-annotation of the proteins encoded in the Kluyveromyces lactis genome was performed, resulting in the annotation of 1759 genes with metabolic functions, and the development of a methodology supported by merlin (software developed in-house). The new annotation includes novelties, such as the assignment of transporter superfamily numbers to genes identified as transporter proteins. Thus, the genes annotated with metabolic functions could be exclusively enzymatic (1410 genes), transporter protein-encoding genes (301 genes) or have both metabolic activities (48 genes). The new annotation produced by this work largely surpassed the currently available Kluyveromyces lactis annotations. A comparison with KEGG’s annotation revealed a match with 844 (~90%) of the genes annotated by KEGG, while adding 850 new gene annotations. Moreover, there are 32 genes with annotations different from KEGG. Conclusions: The methodology developed throughout this work can be used to re-annotate any yeast or, with a little tweak of the reference organism, the proteins encoded in any sequenced genome. The new annotation provided by this study offers basic knowledge which might be useful for the scientific community working on this model yeast, because new functions have been identified for the so-called metabolic genes. Furthermore, it served as the basis for the reconstruction of a compartmentalized, genome-scale metabolic model of Kluyveromyces lactis, which is currently being finished.
Abstract:
A multilocus sequence typing (MLST) scheme was established and evaluated for Mycoplasma hyopneumoniae, the etiologic agent of enzootic pneumonia in swine, with the aim of defining strains. Putative target genes were selected by genome sequence comparisons. Out of 12 housekeeping genes chosen and experimentally validated, the 7 genes efp, metG, pgiB, recA, adk, rpoB, and tpiA were finally used to establish the MLST scheme. Their usefulness was assessed individually and in combination using a set of well-defined field samples and strains of M. hyopneumoniae. A reduction to the three targets showing highest variation (adk, rpoB, and tpiA) was possible, resulting in the same number of sequence types as using the seven targets. The established MLST approach was compared with the recently described typing method using the serine-rich repeat motif-encoding region of the p146 gene. There was coherence between the two methods, but MLST resulted in a slightly higher resolution. Farms recognized to be affected by enzootic pneumonia were always associated with a single M. hyopneumoniae clone, which in most cases differed from farm to farm. However, farms in close geographic or operational contact showed identical clones as defined by MLST typing. Population analysis showed that recombination in M. hyopneumoniae occurs and that strains are very diverse with only limited clonality observed. Elaborate classical MLST schemes using multiple targets for M. hyopneumoniae might therefore be of limited value. In contrast, MLST typing of M. hyopneumoniae using the three genes adk, rpoB, and tpiA seems to be sufficient for epidemiological investigations by direct amplification of target genes from lysate of clinical material without prior cultivation.
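The comparison between the seven-gene and three-gene schemes in the abstract above amounts to counting distinct allele combinations over a chosen set of loci. The sketch below illustrates that idea in Python. The gene names come from the abstract, but the isolate names, the allele numbers, and the function `count_sequence_types` are hypothetical and chosen only for illustration.

```python
LOCI = ["efp", "metG", "pgiB", "recA", "adk", "rpoB", "tpiA"]
REDUCED_LOCI = ["adk", "rpoB", "tpiA"]


def count_sequence_types(profiles, loci):
    """Count distinct allele combinations (sequence types) over the given loci."""
    return len({tuple(profile[locus] for locus in loci) for profile in profiles.values()})


# Hypothetical allele numbers per isolate; not data from the study.
profiles = {
    "isolate_1": dict(zip(LOCI, [1, 1, 1, 1, 1, 1, 1])),
    "isolate_2": dict(zip(LOCI, [1, 1, 1, 1, 2, 1, 1])),
    "isolate_3": dict(zip(LOCI, [1, 2, 1, 1, 2, 3, 1])),
    "isolate_4": dict(zip(LOCI, [1, 1, 1, 1, 1, 1, 1])),  # same type as isolate_1
}

st_all = count_sequence_types(profiles, LOCI)
st_reduced = count_sequence_types(profiles, REDUCED_LOCI)
print(st_all, st_reduced)
```

In this toy data set the reduced scheme resolves as many sequence types as the full scheme, which mirrors the kind of result the abstract reports. Whether that holds for real isolates depends entirely on the observed allele profiles.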
Abstract:
The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor "silent" germline micronuclear genome by a process of "unscrambling" and fragmentation. The tiny macronuclear "nanochromosomes" typically encode single, protein-coding genes (a small portion, 10%, encode 2-8 genes), have minimal noncoding regions, and are differentially amplified to an average of ∼2,000 copies. We report the high-quality genome assembly of ∼16,000 complete nanochromosomes (∼50 Mb haploid genome size) that vary from 469 bp to 66 kb long (mean ∼3.2 kb) and encode ∼18,500 genes. Alternative DNA fragmentation processes ∼10% of the nanochromosomes into multiple isoforms that usually encode complete genes. Nucleotide diversity in the macronucleus is very high (SNP heterozygosity is ∼4.0%), suggesting that Oxytricha trifallax may have one of the largest known effective population sizes of eukaryotes. Comparison to other ciliates with nonscrambled genomes and long macronuclear chromosomes (on the order of 100 kb) suggests several candidate proteins that could be involved in genome rearrangement, including domesticated MULE and IS1595-like DDE transposases. The assembly of the highly fragmented Oxytricha macronuclear genome is the first completed genome with such an unusual architecture. This genome sequence provides tantalizing glimpses into novel molecular biology and evolution. For example, Oxytricha maintains tens of millions of telomeres per cell and has also evolved an intriguing expansion of telomere end-binding proteins. In conjunction with the micronuclear genome in progress, the O. trifallax macronuclear genome will provide an invaluable resource for investigating programmed genome rearrangements, complementing studies of rearrangements arising during evolution and disease.
Abstract:
A comprehensive second-generation whole genome radiation hybrid (RH II), cytogenetic and comparative map of the horse genome (2n = 64) has been developed using the 5000-rad horse × hamster radiation hybrid panel and fluorescence in situ hybridization (FISH). The map contains 4,103 markers (3,816 RH; 1,144 FISH) assigned to all 31 pairs of autosomes and the X chromosome. The RH maps of individual chromosomes are anchored and oriented using 857 cytogenetic markers. The overall resolution of the map is one marker per 775 kilobase pairs (kb), which represents a more than five-fold improvement over the first-generation map. The RH II incorporates 920 markers shared jointly with the two recently reported meiotic maps. Consequently the two maps were aligned with the RH II maps of individual autosomes and the X chromosome. Additionally, a comparative map of the horse genome was generated by connecting 1,904 loci on the horse map with genome sequences available for eight diverse vertebrates to highlight regions of evolutionarily conserved syntenies, linkages, and chromosomal breakpoints. The integrated map thus obtained presents the most comprehensive information on the physical and comparative organization of the equine genome and will assist future assemblies of whole genome BAC fingerprint maps and the genome sequence. It will also serve as a tool to identify genes governing health, disease and performance traits in horses and assist us in understanding the evolution of the equine genome in relation to other species.
Abstract:
Background: A cost-effective strategy to increase the density of available markers within a population is to sequence a small proportion of the population and impute whole-genome sequence data for the remaining population. Increased densities of typed markers are advantageous for genome-wide association studies (GWAS) and genomic predictions. Methods: We obtained genotypes for 54 602 SNPs (single nucleotide polymorphisms) in 1077 Franches-Montagnes (FM) horses and Illumina paired-end whole-genome sequencing data for 30 FM horses and 14 Warmblood horses. After variant calling, the sequence-derived SNP genotypes (~13 million SNPs) were used for genotype imputation with the software programs Beagle, Impute2 and FImpute. Results: The mean imputation accuracy of FM horses using Impute2 was 92.0%. Imputation accuracy using Beagle and FImpute was 74.3% and 77.2%, respectively. In addition, for Impute2 we determined the imputation accuracy of all individual horses in the validation population, which ranged from 85.7% to 99.8%. The subsequent inclusion of Warmblood sequence data further increased the correlation between true and imputed genotypes for most horses, especially for horses with a high level of admixture. The final imputation accuracy of the horses ranged from 91.2% to 99.5%. Conclusions: Using Impute2, the imputation accuracy was higher than 91% for all horses in the validation population, which indicates that direct imputation of 50k SNP-chip data to sequence level genotypes is feasible in the FM population. The individual imputation accuracy depended mainly on the applied software and the level of admixture.
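The abstract above reports imputation accuracy and the correlation between true and imputed genotypes without stating how either was computed. Two common choices are genotype concordance (the fraction of identical calls) and the Pearson correlation of genotype dosages. The Python sketch below shows both under that assumption; the 0/1/2 genotype coding, the function names, and the example vectors are hypothetical, and this is not the authors' pipeline.

```python
from math import sqrt


def concordance(true_gt, imputed_gt):
    """Fraction of SNPs where the imputed genotype equals the true genotype."""
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)


def pearson_r(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical genotypes for one horse, coded as alternate-allele counts (0, 1, 2).
true_gt = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
imputed_gt = [0, 1, 2, 1, 0, 2, 1, 2, 0, 2]  # one SNP imputed incorrectly

accuracy = concordance(true_gt, imputed_gt)
correlation = pearson_r(true_gt, imputed_gt)
print(f"concordance={accuracy:.3f}, r={correlation:.3f}")
```

Computing these values per individual, as in the sketch, corresponds to the per-horse accuracy ranges quoted in the abstract; averaging over individuals would give a population mean.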
Abstract:
Puumala virus (PUUV) is one of the predominant hantavirus species in Europe causing mild to moderate cases of haemorrhagic fever with renal syndrome. Parts of Lower Saxony in north-western Germany are endemic for PUUV infections. In this study, the complete PUUV genome sequence of a bank vole-derived tissue sample from the 2007 outbreak was determined by a combined primer-walking and RNA ligation strategy. The S, M and L genome segments were 1,828, 3,680 and 6,550 nucleotides in length, respectively. Sliding-window analyses of the nucleotide sequences of all available complete PUUV genomes indicated a non-homogeneous distribution of variability with hypervariable regions located at the 3′-ends of the S and M segments. The overall similarity of the coding genome regions to the other PUUV strains ranged between 80.1 and 84.7 % at the level of the nucleotide sequence and between 89.5 and 98.1 % for the deduced amino acid sequences. In comparison to the phylogenetic trees of the complete coding sequences, trees based on partial segments revealed a general drop in phylogenetic support and a lower resolution. The Astrup strain S and M segment sequences showed the highest similarity to sequences of strains from geographically close sites in the Osnabrück Hills region. In conclusion, a primer-walking-mediated strategy resulted in the determination of the first complete nucleotide sequence of a PUUV strain from Central Europe. Different levels of variability along the genome provide the opportunity to choose regions for analyses according to the particular research question, e.g., large-scale phylogenetics or within-host evolution.
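A sliding-window analysis of the kind mentioned in the abstract above can be sketched as the proportion of variable alignment columns per window. The Python example below is a minimal illustration under that assumption; the window size, step, function names, and the three short aligned sequences are hypothetical, and the study's actual variability measure is not stated here.

```python
def variable_sites(alignment):
    """Flag each alignment column in which the sequences are not all identical."""
    return [len(set(column)) > 1 for column in zip(*alignment)]


def sliding_window_variability(alignment, window, step):
    """Return (window start, fraction of variable columns) for each window."""
    flags = variable_sites(alignment)
    profile = []
    for start in range(0, len(flags) - window + 1, step):
        fraction = sum(flags[start:start + window]) / window
        profile.append((start, fraction))
    return profile


# Hypothetical aligned sequences of equal length; variability is placed near the 3' end.
alignment = [
    "ATGCATGCATGCATGC",
    "ATGCATGCATGAATTC",
    "ATGCATGCATGCTTGA",
]

profile = sliding_window_variability(alignment, window=8, step=4)
for start, fraction in profile:
    print(start, fraction)
```

In this toy alignment the last window has the highest fraction of variable columns, which is the pattern one would look for when locating a hypervariable region toward the 3′ end of a segment.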
Abstract:
Introduction: blaOXA-48, blaNDM-1 and blaCTX-M-3 are clinically relevant resistance genes, frequently associated with the broad-host range plasmids of the IncL/M group. The L and M plasmids belong to two compatible groups, which were incorrectly classified together by molecular methods. In order to understand their evolution, we fully sequenced four IncL/M plasmids, including the reference plasmids R471 and R69, the recently described blaOXA-48-carrying plasmid pKPN-El.Nr7 from a Klebsiella pneumoniae isolated in Bern (Switzerland), and the blaSHV-5 carrying plasmid p202c from a Salmonella enterica from Tirana (Albania). Methods: Sequencing was performed using 454 Junior Genome Sequencer (Roche). Annotation was performed using Sequin and Artemis software. Plasmid sequences were compared with 13 fully sequenced plasmids belonging to the IncL/M group available in GenBank. Results: Comparative analysis of plasmid genomes revealed two distinct genetic lineages, each containing one of the R471 (IncL) and R69 (IncM) reference plasmids. Conjugation experiments demonstrated that plasmids representative of the IncL and IncM groups were compatible with each other. The IncL group is constituted by the blaOXA-48-carrying plasmids and R471. The IncM group contains two sub-types of plasmids named IncM1 and IncM2 that are each incompatible. Conclusion: This work re-defines the structure of the IncL and IncM families and ascribes a definitive designation to the fully sequenced IncL/M plasmids available in GenBank.
Abstract:
Whole-genome duplication approximately 10^8 years ago was proposed as an explanation for the many duplicated chromosomal regions in Saccharomyces cerevisiae. Here we have used computer simulations and analytic methods to estimate some parameters describing the evolution of the yeast genome after this duplication event. Computer simulation of a model in which 8% of the original genes were retained in duplicate after genome duplication, and 70–100 reciprocal translocations occurred between chromosomes, produced arrangements of duplicated chromosomal regions very similar to the map of real duplications in yeast. An analytical method produced an independent estimate of 84 map disruptions. These results imply that many smaller duplicated chromosomal regions exist in the yeast genome in addition to the 55 originally reported. We also examined the possibility of determining the original order of chromosomal blocks in the ancestral unduplicated genome, but this cannot be done without information from one or more additional species. If the genome sequence of one other species (such as Kluyveromyces lactis) were known it should be possible to identify 150–200 paired regions covering the whole yeast genome and to reconstruct approximately two-thirds of the original order of blocks of genes in yeast. Rates of interchromosome translocation in yeast and mammals appear similar despite their very different rates of homologous recombination per kilobase.
Abstract:
Despite more than a century of debate, the evolutionary position of turtles (Testudines) relative to other amniotes (reptiles, birds, and mammals) remains uncertain. One of the major impediments to resolving this important evolutionary problem is the highly distinctive and enigmatic morphology of turtles that led to their traditional placement apart from diapsid reptiles as sole descendants of presumably primitive anapsid reptiles. To address this question, the complete (16,787-bp) mitochondrial genome sequence of the African side-necked turtle (Pelomedusa subrufa) was determined. This molecule contains several unusual features: a (TA)n microsatellite in the control region, the absence of an origin of replication for the light strand in the WANCY region of five tRNA genes, an unusually long noncoding region separating the ND5 and ND6 genes, an overlap between ATPase 6 and COIII genes, and the existence of extra nucleotides in ND3 and ND4L putative ORFs. Phylogenetic analyses of the complete mitochondrial genome sequences supported the placement of turtles as the sister group of an alligator and chicken (Archosauria) clade. This result clearly rejects the Haematothermia hypothesis (a sister-group relationship between mammals and birds), as well as rejecting the placement of turtles as the most basal living amniotes. Moreover, evidence from both complete mitochondrial rRNA genes supports a sister-group relationship of turtles to Archosauria to the exclusion of Lepidosauria (tuatara, snakes, and lizards). These results challenge the classic view of turtles as the only survivors of primary anapsid reptiles and imply that turtles might have secondarily lost their skull fenestration.
Abstract:
We present an approach to map large numbers of Tc1 transposon insertions in the genome of Caenorhabditis elegans. Strains have been described that contain up to 500 polymorphic Tc1 insertions. From these we have cloned and shotgun sequenced over 2000 Tc1 flanks, resulting in an estimated set of 400 or more distinct Tc1 insertion alleles. Alignment of these sequences revealed a weak Tc1 insertion site consensus sequence that was symmetric around the invariant TA target site and reads CAYATATRTG. The Tc1 flanking sequences were compared with 40 Mbp of a C. elegans genome sequence. We found 151 insertions within the sequenced area, a density of ≈1 Tc1 insertion in every 265 kb. As the rest of the C. elegans genome sequence is obtained, remaining Tc1 alleles will fall into place. These mapped Tc1 insertions can serve two functions: (i) insertions in or near genes can be used to isolate deletion derivatives that have that gene mutated; and (ii) they represent a dense collection of polymorphic sequence-tagged sites. We demonstrate a strategy to use these Tc1 sequence-tagged sites in fine-mapping mutations.
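Two quantitative statements in the abstract above can be checked with a few lines of Python: that the consensus CAYATATRTG is symmetric (equal to its own reverse complement) and that 151 insertions in 40 Mbp correspond to roughly one insertion per 265 kb. The sketch uses the standard IUPAC codes R (A or G) and Y (C or T). The helper names and the test sequences are hypothetical.

```python
import re

# IUPAC nucleotide codes needed for this consensus, expressed as regex fragments.
IUPAC_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}
# Complements, including the ambiguity codes (R pairs with Y).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R"}

CONSENSUS = "CAYATATRTG"


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC_REGEX[base] for base in consensus))


def reverse_complement(seq):
    """Reverse-complement a sequence that may contain R and Y codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


pattern = consensus_to_regex(CONSENSUS)
is_symmetric = reverse_complement(CONSENSUS) == CONSENSUS

# Insertion density from the figures given in the abstract.
kb_per_insertion = 40_000_000 / 151 / 1000

print(is_symmetric, round(kb_per_insertion))
print(bool(pattern.fullmatch("CACATATGTG")))  # hypothetical site that fits the consensus
```

Because the consensus is described as weak, real insertion sites would often deviate from it at positions other than the invariant TA; a strict regular expression like this one only identifies perfect matches.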