952 results for Prokaryotic Genomes
Abstract:
The evolutionary history of biological entities is recorded within their nucleic acid sequences and can (sometimes) be deciphered by thorough genomic analysis. In this study we sought to gain insights into the diversity and evolution of bacterial and archaeal viruses. Our primary interest was directed towards those virus groups/families for which comprehensive genomic analysis was not previously possible because of insufficient genomic data. During the course of this work, twenty-five putative proviruses integrated into various prokaryotic genomes were identified, enabling us to undertake a comparative genomics approach. This analysis allowed us to test previously formulated evolutionary hypotheses and also provided valuable information on the molecular mechanisms behind the genome evolution of the studied virus groups.
Abstract:
Increasingly, studies of genes and genomes are indicating that considerable horizontal transfer has occurred between prokaryotes. Extensive horizontal transfer has occurred for operational genes (those involved in housekeeping), whereas informational genes (those involved in transcription, translation, and related processes) are seldom horizontally transferred. Through phylogenetic analysis of six complete prokaryotic genomes and the identification of 312 sets of orthologous genes present in all six genomes, we tested two theories describing the temporal flow of horizontal transfer. We show that operational genes have been horizontally transferred continuously since the divergence of the prokaryotes, rather than having been exchanged in one, or a few, massive events that occurred early in the evolution of prokaryotes. In agreement with earlier studies, we found that differences in rates of evolution between operational and informational genes are minimal, suggesting that factors other than rate of evolution are responsible for the observed differences in horizontal transfer. We propose that a major factor in the more frequent horizontal transfer of operational genes is that informational genes are typically members of large, complex systems, whereas operational genes are not, thereby making horizontal transfer of informational gene products less probable (the complexity hypothesis).
Abstract:
Lateral gene transfer (LGT) is considered one of the drivers of bacterial genome evolution, usually associated with increased fitness and/or changes in behavior, especially if one considers pathogenic vs. non-pathogenic bacterial groups. The genomes of two phytopathogens, Xanthomonas campestris pv. campestris and Xanthomonas axonopodis pv. citri, were previously inspected for genome islands originating from LGT events, and, in this work, potentially early and late LGT events were identified according to their altered nucleotide composition. The biological role of the islands was also assessed, and pathogenicity, virulence and secondary metabolism pathways were highly represented functions, especially in islands that were found to be recently transferred. However, old islands are composed of a high proportion of genes related to primary metabolic functions of the cell. These old islands, normally undetected by traditional atypical composition analysis but confirmed as products of LGT by atypical phylogenetic reconstruction, reveal a role for LGT events in replacing core metabolic genes normally inherited by vertical processes.
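The "atypical composition analysis" mentioned in this abstract can be illustrated with a minimal sketch. It assumes GC content in fixed sliding windows as the composition measure and a z-score cutoff as the criterion; the function names, window size and cutoff are hypothetical and are not taken from the study, which may have used other composition statistics.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def atypical_windows(genome, window=1000, step=500, z_cutoff=2.0):
    """Return (start, gc, z) for windows whose GC content deviates from the
    mean over all windows by more than z_cutoff standard deviations.

    window, step and z_cutoff are illustrative defaults (assumptions).
    """
    starts = range(0, max(len(genome) - window + 1, 1), step)
    values = [(s, gc_fraction(genome[s:s + window])) for s in starts]
    gcs = [gc for _, gc in values]
    mu = mean(gcs)
    sigma = pstdev(gcs)
    if sigma == 0:
        return []  # perfectly uniform composition: nothing stands out
    return [
        (s, gc, (gc - mu) / sigma)
        for s, gc in values
        if abs(gc - mu) / sigma > z_cutoff
    ]
```

A screen of this kind only finds regions whose composition still differs from the host genome. As the abstract notes, old islands are normally missed by composition analysis and were confirmed by phylogenetic reconstruction instead.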
Abstract:
The biosynthesis of quinolinate, the de novo precursor of nicotinamide adenine dinucleotide (NAD), may be performed by two distinct pathways, namely, the bacterial aspartate (aspartate-to-quinolinate) and the eukaryotic kynurenine (tryptophan-to-quinolinate) pathways. Even though the separation into eukaryotic and bacterial routes is long established, recent genomic surveys have challenged this view, because certain bacterial species also carry the genes for the kynurenine pathway. In this work, both quinolinate biosynthetic pathways were investigated in the Bacteria clade from an evolutionary viewpoint, with special attention to Xanthomonadales and Bacteroidetes. Genomic screening has revealed that a small number of bacterial species possess some of the genes for the kynurenine pathway, which is complete in the genus Xanthomonas and in the order Flavobacteriales, where the aspartate pathway is absent. The opposite pattern (presence of the aspartate pathway and absence of the kynurenine pathway) in close relatives (Xylella spp. and the order Bacteroidales, respectively) points to a recent acquisition of the kynurenine pathway through lateral gene transfer in these bacterial groups. In fact, sequence similarity comparison and phylogenetic reconstruction both suggest that at least some of the genes of the kynurenine pathway in Xanthomonas and Flavobacteriales are shared with eukaryotes. These results reinforce the idea that lateral gene transfer plays a role in the configuration of bacterial genomes, providing alternative metabolic pathways, even replacing primary and essential cell functions, as exemplified by NAD biosynthesis.
Abstract:
Background: Decreasing costs of DNA sequencing have made prokaryotic draft genome sequences increasingly common. A contig scaffold is an ordering of contigs in the correct orientation. A scaffold can help genome comparisons and guide gap closure efforts. One popular technique for obtaining contig scaffolds is to map contigs onto a reference genome. However, rearrangements that may exist between the query and reference genomes may result in incorrect scaffolds, if these rearrangements are not taken into account. Large-scale inversions are common rearrangement events in prokaryotic genomes. Even in draft genomes it is possible to detect the presence of inversions given sufficient sequencing coverage and a sufficiently close reference genome. Results: We present a linear-time algorithm that can generate a set of contig scaffolds for a draft genome sequence represented in contigs given a reference genome. The algorithm is aimed at prokaryotic genomes and relies on the presence of matching sequence patterns between the query and reference genomes that can be interpreted as the result of large-scale inversions; we call these patterns inversion signatures. Our algorithm is capable of correctly generating a scaffold if at least one member of every inversion signature pair is present in contigs and no inversion signatures have been overwritten in evolution. The algorithm is also capable of generating scaffolds in the presence of any kind of inversion, even though in this general case there is no guarantee that all scaffolds in the scaffold set will be correct. We compare the performance of SIS, the program that implements the algorithm, to seven other scaffold-generating programs. The results of our tests show that SIS has overall better performance. Conclusions: SIS is a new easy-to-use tool to generate contig scaffolds, available both as stand-alone and as a web server. 
The good performance of SIS in our tests adds evidence that large-scale inversions are widespread in prokaryotic genomes.
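The reference-mapping technique described in the Background of this abstract can be sketched as follows. This is not the SIS algorithm: it is the naive baseline that simply orders contigs by where they align on the reference. The `Hit` record and its fields are hypothetical, and the alignments are assumed to come from an external aligner.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Best alignment of one contig on the reference (hypothetical record)."""
    contig: str     # contig identifier
    ref_start: int  # leftmost reference coordinate of the alignment
    strand: str     # "+" if aligned forward, "-" if reverse-complemented


def naive_scaffold(hits):
    """Order contigs by reference position and keep their orientation.

    This ignores rearrangements between query and reference, so an
    inversion in the query genome yields a wrong order or orientation.
    """
    ordered = sorted(hits, key=lambda h: h.ref_start)
    return [(h.contig, h.strand) for h in ordered]


hits = [Hit("c3", 9000, "+"), Hit("c1", 100, "+"), Hit("c2", 4000, "-")]
print(naive_scaffold(hits))
```

Because the ordering trusts the reference coordinates completely, it shows the failure mode the abstract describes: rearrangements that are not taken into account produce incorrect scaffolds.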
Abstract:
Background: DNA repair genes encode proteins that protect organisms against genetic damage generated by environmental agents and by-products of cell metabolism. The importance of these genes in life maintenance is supported by their high conservation, and the presence of duplications of such genes may be easily traced, especially in prokaryotic genomes. Results: The genome sequences of two Xanthomonas species were used as the basis for phylogenetic analyses of genes related to DNA repair that were found duplicated. Although 16S rRNA phylogenetic analyses confirm their classification at the base of the gamma proteobacteria subdivision, differences were found in the origin of the various genes investigated. Except for lexA, detected as a recent duplication, most of the genes present in more than one copy are represented by two highly divergent orthologs. In general, one copy is frequently positioned close to other gamma proteobacteria, but the second is often positioned close to unrelated bacteria. These orthologs may have arisen from old duplication events followed by extensive gene loss, or may have originated from lateral gene transfer (LGT), as is the case of the uvrD homolog. Conclusions: Duplications of DNA repair related genes may result in redundancy and also improve the organisms' responses to environmental challenges. Most of these duplications in Xanthomonas seem to have arisen from old events and possibly enlarge both the functional and the evolutionary potential of the genome.
Abstract:
Motivation: A current issue of great interest, from both a theoretical and an applied perspective, is the analysis of biological sequences to disclose the information that they encode. The development of new genome sequencing technologies in recent years has opened new fundamental problems, since huge amounts of biological data still await interpretation. Indeed, sequencing is only the first step of the genome annotation process, which consists in the assignment of biological information to each sequence. Hence, given the large amount of available data, in silico methods have become useful and necessary for extracting relevant information from sequences. The availability of data from genome projects gave rise to new strategies for tackling the basic problems of computational biology, such as the determination of the three-dimensional structures of proteins, their biological function and their reciprocal interactions. Results: The aim of this work has been the implementation of predictive methods that allow the extraction of information on the properties of genomes and proteins starting from the nucleotide and amino acid sequences, by taking advantage of the information provided by the comparison of genome sequences from different species. In the first part of the work, a comprehensive large-scale genome comparison of 599 organisms is described. 2.6 million sequences from 551 prokaryotic and 48 eukaryotic genomes were aligned and clustered on the basis of their sequence identity. This procedure led to the identification of classes of proteins that are peculiar to the different groups of organisms. Moreover, the adopted similarity threshold produced clusters that are homogeneous from the structural point of view and that can be used for structural annotation of uncharacterized sequences.
The second part of the work focuses on the characterization of thermostable proteins and on the development of tools able to predict the thermostability of a protein starting from its sequence. By means of Principal Component Analysis, the codon composition of a non-redundant database comprising 116 prokaryotic genomes has been analyzed, and it has been shown that a cross-genomic approach can allow the extraction of common determinants of thermostability at the genome level, leading to an overall accuracy in discriminating thermophilic coding sequences equal to 95%. This result outperforms those obtained in previous studies. Moreover, we investigated the effect of multiple mutations on protein thermostability. This issue is of great importance in the field of protein engineering, since thermostable proteins are generally more suitable than their mesostable counterparts in technological applications. A Support Vector Machine-based method has been trained to predict whether a set of mutations can enhance the thermostability of a given protein sequence. The developed predictor achieves 88% accuracy.
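As an illustration of the "codon composition" that serves as input to the Principal Component Analysis, the sketch below turns one coding sequence into a 64-dimensional vector of relative codon frequencies. The function name and the handling of ambiguous bases are assumptions; the abstract does not say how the original work built its feature vectors.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every sequence maps to the same axes.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds):
    """Relative frequency of each of the 64 codons in a coding sequence.

    The sequence is read in frame from its first base; a trailing partial
    codon and codons containing non-ACGT characters are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


vector = codon_frequencies("ATGAAAAAATAA")
print(vector[CODONS.index("AAA")])
```

Stacking one such vector per coding sequence (or per genome) gives a matrix that a PCA routine can decompose. The PCA step is left out here to keep the sketch dependency-free.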
Abstract:
Genetic instability in mammalian cells can occur by many different mechanisms. In the absence of exogenous sources of DNA damage, the DNA structure itself has been implicated in genetic instability. When the canonical B-DNA helix is naturally altered to form a non-canonical DNA structure such as Z-DNA or H-DNA, this can lead to genetic instability in the form of DNA double-strand breaks (DSBs) (1, 2). Our laboratory found that the stability of these non-B DNA structures was different in mammals versus Escherichia coli (E. coli) bacteria (1, 2). One explanation for the difference between these species may be how DSBs are repaired within each species. Non-homologous end-joining (NHEJ) is primed to repair DSBs in mammalian cells, while bacteria that lack NHEJ (such as E. coli) utilize homologous recombination (HR) to repair DSBs. To investigate the role of the error-prone NHEJ repair pathway in DNA structure-induced genetic instability, E. coli cells were modified to express genes to allow for a functional NHEJ system under different HR backgrounds. The Mycobacterium tuberculosis NHEJ-sufficient system is composed of Ku and Ligase D (LigD) (3). These inducible NHEJ components were expressed individually and together in E. coli cells, with or without functional HR (RecA/RecB), and the Z-DNA and H-DNA-induced mutations were characterized. The Z-DNA structure gave rise to higher mutation frequencies compared to the controls, regardless of the DSB repair pathway(s) available; however, the type of mutants produced after repair was largely determined by the available DSB repair system, as indicated by the shift from 2% large-scale deletions in the total mutant population to 24% large-scale deletions when NHEJ was present (4). This suggests that NHEJ has a role in the large deletions induced by Z-DNA-forming sequences.
The H-DNA structure, however, did not exhibit an increase in mutagenesis in the newly engineered E. coli environment, suggesting the involvement of other factors in regulating H-DNA formation/stability in bacterial cells. Accurate repair by established DNA DSB repair pathways is essential to maintain the stability of eukaryotic and prokaryotic genomes, and our results suggest that an error-prone NHEJ pathway was involved in non-B DNA structure-induced mutagenesis in both prokaryotes and eukaryotes.
Abstract:
Analyses of complete genomes indicate that a massive prokaryotic gene transfer (or transfers) preceded the formation of the eukaryotic cell. In comparisons of the entire set of Methanococcus jannaschii genes with their orthologs from Escherichia coli, Synechocystis 6803, and the yeast Saccharomyces cerevisiae, it is shown that prokaryotic genomes consist of two different groups of genes. The deeper, diverging informational lineage codes for genes which function in translation, transcription, and replication, and also includes GTPases, vacuolar ATPase homologs, and most tRNA synthetases. The more recently diverging operational lineage codes for amino acid synthesis, the biosynthesis of cofactors, the cell envelope, energy metabolism, intermediary metabolism, fatty acid and phospholipid biosynthesis, nucleotide biosynthesis, and regulatory functions. In eukaryotes, the informational genes are most closely related to those of Methanococcus, whereas the majority of operational genes are most closely related to those of Escherichia, but some are closest to Methanococcus or to Synechocystis.
Abstract:
With more than 10 fully sequenced, publicly available prokaryotic genomes, it is now becoming possible to gain useful insights into genome evolution. Before the genome era, many evolutionary processes were evaluated from limited data sets and evolutionary models were constructed on the basis of small amounts of evidence. In this paper, I show that genes on the Borrelia burgdorferi genome have two separate, distinct, and significantly different codon usages, depending on whether the gene is transcribed on the leading or lagging strand of replication. Asymmetrical replication is the major source of codon usage variation. Replicational selection is responsible for the higher number of genes on the leading strands, and transcriptional selection appears to be responsible for the enrichment of highly expressed genes on these strands. Replicational–transcriptional selection, therefore, has an influence on the codon usage of a gene. This is a new paradigm of codon selection in prokaryotes.
Abstract:
The Ribosomal RNA Operon Copy Number Database (rrndb) is an Internet-accessible database containing annotated information on rRNA operon copy number among prokaryotes. Gene redundancy is uncommon in prokaryotic genomes, yet the rRNA genes can vary from one to as many as 15 copies. Despite the widespread use of 16S rRNA gene sequences for identification of prokaryotes, information on the number and sequence of individual rRNA genes in a genome is not readily accessible. In an attempt to understand the evolutionary implications of rRNA operon redundancy, we have created a phylogenetically arranged report on rRNA gene copy number for a diverse collection of prokaryotic microorganisms. Each entry (organism) in the rrndb contains detailed information linked directly to external websites including the Ribosomal Database Project, GenBank, PubMed and several culture collections. Data contained in the rrndb will be valuable to researchers investigating microbial ecology and evolution using 16S rRNA gene sequences. The rrndb web site is directly accessible on the WWW at http://rrndb.cme.msu.edu.
Abstract:
Predicted highly expressed (PHX) and putative alien genes determined by codon usages are characterized in the genome of Deinococcus radiodurans (strain R1). Deinococcus radiodurans (DEIRA) can survive very high doses of ionizing radiation that are lethal to virtually all other organisms. It has been argued that DEIRA is endowed with enhanced repair systems that provide protection and stability. However, predicted expression levels of DNA repair proteins with the exception of RecA tend to be low and do not distinguish DEIRA from other prokaryotes. In this paper, the capability of DEIRA to resist extreme doses of ionizing and UV radiation is attributed to an unusually high number of PHX chaperone/degradation, protease, and detoxification genes. Explicitly, compared with all current complete prokaryotic genomes, DEIRA contains the greatest number of PHX detoxification and protease proteins. Other sources of environmental protection against severe conditions of UV radiation, desiccation, and thermal effects for DEIRA are the several S-layer (surface structure) PHX proteins. The top PHX gene of DEIRA is the multifunctional tricarboxylic acid (TCA) gene aconitase, which, apart from its role in respiration, also alerts the cell to oxidative damage.
Abstract:
Gene homologs of GlnK PII regulators and AmtB-type ammonium transporters are often paired on prokaryotic genomes, suggesting these proteins share an ancient functional relationship. Here, we demonstrate for the first time in Archaea that GlnK associates with AmtB in membrane fractions after ammonium shock, thus providing further insight into GlnK-AmtB as an ancient nitrogen sensor pair. For this work, Haloferax mediterranei was advanced for study through the generation of a pyrE2-based counterselection system that was used for targeted gene deletion and expression of Flag-tagged proteins from their native promoters. AmtB1-Flag was detected in membrane fractions of cells grown on nitrate and was found to coimmunoprecipitate with GlnK after ammonium shock. Thus, in analogy to bacteria, the archaeal GlnK PII may block the AmtB1 ammonium transporter under nitrogen-rich conditions. In addition to this regulated protein–protein interaction, the archaeal amtB-glnK gene pairs were found to be highly regulated by nitrogen availability, with transcript levels high under conditions of nitrogen limitation and low during nitrogen excess. While transcript levels of glnK-amtB are similarly regulated by nitrogen availability in bacteria, transcriptional regulators of the bacterial glnK promoter, including activation by the two-component signal transduction proteins NtrC (GlnG, NRI) and NtrB (GlnL, NRII) and sigma factor σN (σ54), are not conserved in archaea, suggesting a novel mechanism of transcriptional control.
Abstract:
Non-tree-based ('surrogate') methods have been used to identify instances of lateral genetic transfer in microbial genomes but agreement among predictions of different methods can be poor. It has been proposed that this disagreement arises because different surrogate methods are biased towards the detection of certain types of transfer events. This conjecture is supported by a rigorous phylogenetic analysis of 3776 proteins in Escherichia coli K12 MG1655 to map the ages of transfer events relative to one another.
Abstract:
Background: The overwhelming majority of the Serine/Threonine protein kinases identified by gleaning archaeal and eubacterial genomes could not be classified into any of the well-known Hanks and Hunter subfamilies of protein kinases. This is because the Hanks and Hunter classification scheme was developed from eukaryotic protein kinases, which are highly divergent from their prokaryotic homologues. A large dataset of prokaryotic Serine/Threonine protein kinases recognized from genomes of prokaryotes has been used to develop a classification framework for prokaryotic Ser/Thr protein kinases. Methodology/Principal Findings: We have used traditional sequence alignment and phylogenetic approaches and clustered the prokaryotic kinases, which represent 72 subfamilies with at least 4 members in each. Such a clustering enables classification of prokaryotic Ser/Thr kinases and can be used as a framework to classify newly identified prokaryotic Ser/Thr kinases. After a series of searches in a comprehensive sequence database, we recognized that 38 subfamilies of prokaryotic protein kinases are associated with a specific taxonomic level. For example, 4, 6 and 3 subfamilies have been identified that are currently specific to the phyla Proteobacteria, Cyanobacteria and Actinobacteria, respectively. Similarly, subfamilies that are specific to an order, sub-order, class, family and genus have also been identified. In addition to these, we also identify organism-diverse subfamilies. Members of these clusters are from organisms of different taxonomic levels, such as archaea, bacteria, eukaryotes and viruses. Conclusion/Significance: Interestingly, the occurrence of several taxonomic level specific subfamilies of prokaryotic kinases contrasts with the classification of eukaryotic protein kinases, in which most of the popular subfamilies occur diversely in several eukaryotes.
Many prokaryotic Ser/Thr kinases exhibit a wide variety of modular organizations, which indicates a degree of complexity and protein-protein interactions in the signaling pathways of these microbes.