939 results for bacteria genome nucleotide usage
Abstract:
Understanding the factors responsible for variations in mutation patterns and selection efficacy along chromosomes is a prerequisite for deciphering genome sequences. Population genetics models predict a positive correlation between the efficacy of selection at a given locus and the local rate of recombination because of Hill–Robertson effects. Codon usage is considered one of the most striking examples that support this prediction at the molecular level. In a wide range of species including Caenorhabditis elegans and Drosophila melanogaster, codon usage is essentially shaped by selection acting for translational efficiency. Codon usage bias correlates positively with recombination rate in Drosophila, apparently supporting the hypothesis that selection on codon usage is improved by recombination. Here we present an exhaustive analysis of codon usage in C. elegans and D. melanogaster complete genomes. We show that in both genomes there is a positive correlation between recombination rate and the frequency of optimal codons. However, we demonstrate that in both species, this effect is due to a mutational bias toward G and C bases in regions of high recombination rate, possibly as a direct consequence of the recombination process. The correlation between codon usage bias and recombination rate in these species appears to be essentially determined by recombination-dependent mutational patterns, rather than selective effects. This result highlights that it is necessary to take into account the mutagenic effect of recombination to understand the evolutionary role and impact of recombination.
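The confounding described in this abstract can be illustrated with a minimal Python sketch, which is not the authors' method. It computes the frequency of optimal codons (Fop) and the G+C content at third codon positions (GC3) for one coding sequence. The function names and the example set of optimal codons are hypothetical; a real analysis would use a species-specific list. When the optimal codons mostly end in G or C, a mutational shift toward G and C raises both statistics together.

```python
# Codons with no synonymous alternative (Met, Trp) and stop codons carry no
# information about codon choice, so this sketch leaves them out.
UNINFORMATIVE = {"ATG", "TGG", "TAA", "TAG", "TGA"}

def informative_codons(cds):
    """Split a coding sequence into complete codons and drop uninformative ones."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [c for c in codons if c not in UNINFORMATIVE]

def fop(cds, optimal):
    """Frequency of optimal codons among the informative codons."""
    codons = informative_codons(cds)
    return sum(c in optimal for c in codons) / len(codons) if codons else 0.0

def gc3(cds):
    """Fraction of informative codons whose third base is G or C."""
    codons = informative_codons(cds)
    return sum(c[2] in "GC" for c in codons) / len(codons) if codons else 0.0

# Hypothetical optimal codons, chosen only for illustration.
EXAMPLE_OPTIMAL = {"CTG", "GCC"}
example_cds = "ATGCTGCTAGCCTAA"
print(fop(example_cds, EXAMPLE_OPTIMAL), gc3(example_cds))
```

In this toy sequence every optimal codon ends in G or C, so Fop and GC3 take the same value, which is why a correlation between Fop and recombination rate cannot by itself separate selection from a G+C mutational bias.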
Abstract:
We first review what is known about patterns of codon usage bias in Drosophila and make the following points: (i) Drosophila genes are as biased or more biased than those in microorganisms. (ii) The level of bias of genes and even the particular pattern of codon bias can remain phylogenetically invariant for very long periods of evolution. (iii) However, some genes, even very tightly linked genes, can change very greatly in codon bias across species. (iv) Generally G and especially C are favored at synonymous sites in biased genes. (v) With the exception of aspartic acid, all amino acids contribute significantly and about equally to the codon usage bias of a gene. (vi) While most individual amino acids that can use G or C at synonymous sites display a preference for C, there are exceptions: valine and leucine, which prefer G. (vii) Finally, smaller genes tend to be more biased than longer genes. We then examine possible causes of these patterns and discount mutation bias on three bases: there is little evidence of regional mutation bias in Drosophila, mutation bias is likely toward A+T (the opposite of codon usage bias), and not all amino acids display the preference for the same nucleotide in the wobble position. Two lines of evidence support a selection hypothesis based on tRNA pools: highly biased genes tend to be highly and/or rapidly expressed, and the preferred codons in highly biased genes optimally bind the most abundant isoaccepting tRNAs. Finally, we examine the effect of bias on DNA evolution and confirm that genes with high codon usage bias have lower rates of synonymous substitution between species than do genes with low codon usage bias. Surprisingly, we find that genes with higher codon usage bias display higher levels of intraspecific synonymous polymorphism. This may be due to opposing effects of recombination.
Abstract:
Simple phylogenetic tests were applied to a large data set of nucleotide sequences from two nuclear genes and a region of the mitochondrial genome of Trypanosoma cruzi, the agent of Chagas' disease. Incongruent gene genealogies manifest genetic exchange among distantly related lineages of T. cruzi. Two widely distributed isoenzyme types of T. cruzi are hybrids, their genetic composition being the likely result of genetic exchange between two distantly related lineages. The data show that the reference strain for the T. cruzi genome project (CL Brener) is a hybrid. Well-supported gene genealogies show that mitochondrial and nuclear gene sequences from T. cruzi cluster, respectively, in three or four distinct clades that do not fully correspond to the two previously defined major lineages of T. cruzi. There is clear genetic differentiation among the major groups of sequences, but genetic diversity within each major group is low. We estimate that the major extant lineages of T. cruzi have diverged during the Miocene or early Pliocene (3–16 million years ago).
Abstract:
The genome of the crenarchaeon Sulfolobus solfataricus P2 contains 2,992,245 bp on a single chromosome and encodes 2,977 proteins and many RNAs. One-third of the encoded proteins have no detectable homologs in other sequenced genomes. Moreover, 40% appear to be archaeal-specific, and only 12% and 2.3% are shared exclusively with bacteria and eukarya, respectively. The genome shows a high level of plasticity with 200 diverse insertion sequence elements, many putative nonautonomous mobile elements, and evidence of integrase-mediated insertion events. There are also long clusters of regularly spaced tandem repeats. Different transfer systems are used for the uptake of inorganic and organic solutes, and a wealth of intracellular and extracellular proteases, sugar, and sulfur metabolizing enzymes are encoded, as well as enzymes of the central metabolic pathways and motility proteins. The major metabolic electron carrier is not NADH as in bacteria and eukarya but probably ferredoxin. The essential components required for DNA replication, DNA repair and recombination, the cell cycle, transcriptional initiation and translation, but not DNA folding, show a strong eukaryal character with many archaeal-specific features. The results illustrate major differences between crenarchaea and euryarchaea, especially for their DNA replication mechanism and cell cycle processes and their translational apparatus.
Abstract:
Microbes whose genomes are encoded by DNA and for which adequate information is available display similar genomic mutation rates (average 0.0034 mutations per chromosome replication, range 0.0025 to 0.0046). However, this value currently is based on only a few well characterized microbes reproducing within a narrow range of environmental conditions. In particular, no genomic mutation rate has been determined either for a microbe whose natural growth conditions may extensively damage DNA or for any member of the archaea, a prokaryotic lineage deeply diverged from both bacteria and eukaryotes. Both of these conditions are met by the extreme thermoacidophile Sulfolobus acidocaldarius. We determined the genomic mutation rate for this species when growing at pH 3.5 and 75°C based on the rate of forward mutation at the pyrE gene and the nucleotide changes identified in 101 independent mutants. The observed value of about 0.0018 extends the range of DNA-based microbes with rates close to the standard rate simultaneously to an archaeon and to an extremophile whose cytoplasmic pH and normal growth temperature greatly accelerate the spontaneous decomposition of DNA. The mutations include base pair substitutions (BPSs) and additions and deletions of various sizes, but the S. acidocaldarius spectrum differs from those of other DNA-based organisms in being relatively poor in BPSs. The paucity of BPSs cannot yet be explained by known properties of DNA replication or repair enzymes of Sulfolobus spp. It suggests, however, that molecular evolution per genome replication may proceed more slowly in S. acidocaldarius than in other DNA-based organisms examined to date.
Abstract:
The complete nucleotide sequence, 5178 bp, of the totivirus Helminthosporium victoriae 190S virus (Hv190SV) double-stranded RNA, was determined. Computer-assisted sequence analysis revealed the presence of two large overlapping ORFs; the 5'-proximal large ORF (ORF1) codes for the coat protein (CP) with a predicted molecular mass of 81 kDa, and the 3'-proximal ORF (ORF2), which is in the -1 frame relative to ORF1, codes for an RNA-dependent RNA polymerase (RDRP). Unlike many other totiviruses, the overlap region between ORF1 and ORF2 lacks known structural information required for translational frameshifting. Using an antiserum to a C-terminal fragment of the RDRP, the product of ORF2 was identified as a minor virion-associated polypeptide of estimated molecular mass of 92 kDa. No CP-RDRP fusion protein with calculated molecular mass of 165 kDa was detected. The predicted start codon of the RDRP ORF (2605-AUG-2607) overlaps with the stop codon (2606-UGA-2608) of the CP ORF, suggesting RDRP is expressed by an internal initiation mechanism. Hv190SV is associated with a debilitating disease of its phytopathogenic fungal host. Knowledge of its genome organization and expression will be valuable for understanding its role in pathogenesis and for potential exploitation in the development of biocontrol measures.
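The start/stop overlap quoted in this abstract can be checked with a short Python sketch. It uses only the coordinates and codons given above; the helper names `overlap_window` and `relative_frame` are hypothetical and introduced for illustration.

```python
# Coordinates quoted in the abstract (1-based position of the first base).
START_CODON = (2605, "AUG")  # predicted RDRP (ORF2) start codon
STOP_CODON = (2606, "UGA")   # CP (ORF1) stop codon

def overlap_window(start, stop):
    """Merge two overlapping codons into the shortest RNA string containing both."""
    first = min(start[0], stop[0])
    length = max(start[0], stop[0]) + 3 - first
    window = ["?"] * length
    for pos, seq in (start, stop):
        for k, base in enumerate(seq):
            i = pos - first + k
            if window[i] not in ("?", base):
                raise ValueError("codons disagree at position %d" % (pos + k))
            window[i] = base
    return "".join(window)

def relative_frame(start_pos, stop_pos):
    """Frame of the ORF starting at start_pos relative to the ORF whose stop
    codon begins at stop_pos, reported as -1, 0 or +1."""
    shift = (start_pos - stop_pos) % 3
    return shift - 3 if shift == 2 else shift

print(overlap_window(START_CODON, STOP_CODON))           # the four shared bases
print(relative_frame(START_CODON[0], STOP_CODON[0]))     # frame of ORF2 vs ORF1
```

The two codons are mutually consistent over their shared bases and merge into the tetranucleotide AUGA, and the computed frame of -1 agrees with the statement that ORF2 lies in the -1 frame relative to ORF1.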
Abstract:
The whole genome sequence (1.83 Mbp) of Haemophilus influenzae strain Rd was searched to identify tandem oligonucleotide repeat sequences. Loss or gain of one or more nucleotide repeats through a recombination-independent slippage mechanism is known to mediate phase variation of surface molecules of pathogenic bacteria, including H. influenzae. This facilitates evasion of host defenses and adaptation to the varying microenvironments of the host. We reasoned that iterative nucleotides could identify novel genes relevant to microbe-host interactions. Our search of the Rd genome sequence identified 9 novel loci with multiple (range 6-36, mean 22) tandem tetranucleotide repeats. All were found to be located within putative open reading frames and included homologues of hemoglobin-binding proteins of Neisseria, a glycosyltransferase (lgtC gene product) of Neisseria, and an adhesin of Yersinia. These tetranucleotide repeat sequences were also shown to be present in two other epidemiologically different H. influenzae type b strains, although the number and distribution of repeats was different. Further characterization of the lgtC gene showed that it was involved in phenotypic switching of a lipopolysaccharide epitope and that this variable expression was associated with changes in the number of tetranucleotide repeats. Mutation of lgtC resulted in attenuated virulence of H. influenzae in an infant rat model of invasive infection. These data indicate the rapidity, economy, and completeness with which whole genome sequences can be used to investigate the biology of pathogenic bacteria.
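A search of the kind described in this abstract could be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, the default threshold of 6 copies is taken from the lower end of the range reported above, and only one strand is scanned.

```python
import re

def find_tetranucleotide_repeats(seq, min_copies=6):
    """Return (start, unit, copies) for each run of a 4-base unit repeated
    in tandem at least min_copies times. Start is a 0-based offset."""
    # A 4-base unit followed by at least (min_copies - 1) further copies of itself.
    pattern = re.compile(r"([ACGT]{4})\1{%d,}" % (min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        # Skip units that are really mono- or dinucleotide repeats (AAAA, ACAC).
        if unit[:2] == unit[2:]:
            continue
        hits.append((match.start(), unit, len(match.group(0)) // 4))
    return hits

# Toy sequence: eight copies of CAAT between short flanks.
toy = "GGG" + "CAAT" * 8 + "GGG"
print(find_tetranucleotide_repeats(toy))
```

Because regular-expression matches are reported leftmost-first, the reported unit may be a rotation of the motif a reader would write by eye (for example TCAA instead of CAAT), depending on the flanking bases.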
Abstract:
We have developed a system for generation of infectious bursal disease virus (IBDV), a segmented double-stranded RNA virus of the Birnaviridae family, with the use of synthetic transcripts derived from cloned cDNA. Independent full-length cDNA clones were constructed that contained the entire coding and noncoding regions of RNA segments A and B of two distinguishable IBDV strains of serotype I. Segment A encodes all of the structural (VP2, VP4, and VP3) and nonstructural (VP5) proteins, whereas segment B encodes the RNA-dependent RNA polymerase (VP1). Synthetic RNAs of both segments were produced by in vitro transcription of linearized plasmids with T7 RNA polymerase. Transfection of Vero cells with combined plus-sense transcripts of both segments generated infectious virus as early as 36 hr after transfection. The infectivity and specificity of the recovered chimeric virus was ascertained by the appearance of cytopathic effect in chicken embryo cells, by immunofluorescence staining of infected Vero cells with rabbit anti-IBDV serum, and by nucleotide sequence analysis of the recovered virus, respectively. In addition, transfectant viruses containing genetically tagged sequences in either segment A or segment B of IBDV were generated to confirm the feasibility of this system. The development of a reverse genetics system for double-stranded RNA viruses will greatly facilitate studies of the regulation of viral gene expression, pathogenesis, and design of a new generation of live vaccines.
Abstract:
The toil by photosynthesizing cyanobacteria and blue-green algae of nearly three billion years appeared to have finally resulted in the sufficient accumulation of molecular oxygen. So, the stage was set for the emergence, at the ocean bottom, of diverse animals that were consumers of molecular oxygen. It now appears that this Cambrian explosion, during which nearly all the extant animal phyla have emerged, was of an astonishingly short duration, lasting only 6-10 million years. Inasmuch as only a 1% DNA base sequence change is expected in 10 million years under the standard spontaneous mutation rate, I propose that all those diverse animals of the early Cambrian period, some 550 million years ago, were endowed with nearly identical genomes, with differential usage of the same set of genes accounting for the extreme diversities of body forms. Some of the more pertinent genes that are thought to be included in the Cambrian pananimalia genome are as follows. (i) A gene for lysyl oxidase that, in the presence of molecular oxygen, crosslinked collagen triple helices to produce ligaments and tendons, thus contributing to the stout bodies of the Cambrian animals. (ii) Genes for hemoglobin; these internal transporters of molecular oxygen are today seen sporadically in members of diverse animal phyla. (iii) The Pax-6 gene for eye formation; the eyes of a ribbon worm to a human are organized by this gene. In animals without eyes, the same gene organizes other sensory systems and organs. (iv) A series of Hox genes for the anterior-posterior (cranio-caudal) body plans: these genes are also present in all phyla of the kingdom Animalia.
Abstract:
Nucleoside diphosphate (NDP) kinase is a ubiquitous nonspecific enzyme that evidently is designed to catalyze in vivo ATP-dependent synthesis of ribo- and deoxyribonucleoside triphosphates from the corresponding diphosphates. Because Escherichia coli contains only one copy of ndk, the structural gene for this enzyme, we were surprised to find that ndk disruption yields bacteria that are still viable. These mutant cells contain a protein with a small amount of NDP kinase activity. The protein responsible for this activity was purified and identified as adenylate kinase. This enzyme, also called myokinase, catalyzes the reversible ATP-dependent synthesis of ADP from AMP. We found that this enzyme from E. coli as well as from higher eukaryotes has a broad substrate specificity displaying dual enzymatic functions. Among the nucleoside monophosphate kinases tested, only adenylate kinase was found to have NDP kinase activity. To our knowledge, this is the first report of NDP kinase activity associated with adenylate kinase.
Novel human DNA alkyltransferases obtained by random substitution and genetic selection in bacteria.
Abstract:
DNA repair alkyltransferases protect organisms against the cytotoxic, mutagenic, and carcinogenic effects of alkylating agents by transferring alkyl adducts from DNA to an active cysteine on the protein, thereby restoring the native DNA structure. We used random sequence substitutions to gain structure-function information about the human O6-methylguanine-DNA methyltransferase (EC 2.1.1.63), as well as to create active mutants. Twelve codons surrounding but not including the active cysteine were replaced by a random nucleotide sequence, and the resulting random library was selected for the ability to provide alkyltransferase-deficient Escherichia coli with resistance to the methylating agent N-methyl-N'-nitro-N-nitrosoguanidine. Few amino acid changes were tolerated in this evolutionarily conserved region of the protein. One mutation, a valine to phenylalanine change at codon 139 (V139F), was found in 70% of the selected mutants; in fact, this mutant was selected much more frequently than the wild type. V139F provided alkyltransferase-deficient bacteria with greater protection than the wild-type protein against both the cytotoxic and mutagenic effects of N-methyl-N'-nitro-N-nitrosoguanidine, increasing the D37 over 4-fold and reducing the mutagenesis rate 2.7-5.5-fold. This mutant human alkyltransferase, or others similarly created and selected, could be used to protect bone marrow cells from the cytotoxic side effects of alkylation-based chemotherapeutic regimens.
Abstract:
Mapping the insertion points of 16 signature-tagged transposon mutants on the Salmonella typhimurium chromosome led to the identification of a 40-kb virulence gene cluster at minute 30.7. This locus is conserved among all other Salmonella species examined but is not present in a variety of other pathogenic bacteria or in Escherichia coli K-12. Nucleotide sequencing of a portion of this locus revealed 11 open reading frames whose predicted proteins encode components of a type III secretion system. To distinguish between this and the type III secretion system encoded by the inv/spa invasion locus known to reside on a pathogenicity island, we refer to the inv/spa locus as Salmonella pathogenicity island (SPI) 1 and the new locus as SPI2. SPI2 has a lower G+C content than that of the remainder of the Salmonella genome and is flanked by genes whose products share greater than 90% identity with those of the E. coli ydhE and pykF genes. Thus SPI2 was probably acquired horizontally by insertion into a region corresponding to that between the ydhE and pykF genes of E. coli. Virulence studies of SPI2 mutants have shown them to be attenuated by at least five orders of magnitude compared with the wild-type strain after oral or intraperitoneal inoculation of mice.
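The G+C comparison mentioned in this abstract is commonly made with a sliding window along the chromosome. The Python sketch below shows the idea under stated assumptions: the function names, window size and step are hypothetical choices for illustration, and the toy sequence is not Salmonella data.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0

def windowed_gc(seq, window, step):
    """G+C content of successive windows, as (0-based start, fraction) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

# Toy sequence: a G+C-poor block embedded in G+C-rich flanks.
toy = "GC" * 10 + "AT" * 10 + "GC" * 10
for start, fraction in windowed_gc(toy, window=20, step=20):
    print(start, fraction)
```

A region whose windows sit consistently below the genome-wide value, as reported here for SPI2, is one of the usual hints of horizontal acquisition, although composition alone does not prove it.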
Abstract:
The genetic code is based on aminoacylation reactions where specific amino acids are attached to tRNAs bearing anticodon trinucleotides. However, the anticodon-independent specific aminoacylation of RNA minihelix substrates by bacterial and yeast tRNA synthetases suggested an operational RNA code for amino acids whereby specific RNA sequences/structures in tRNA acceptor stems correspond to specific amino acids. Because of the possible significance of the operational RNA code for the development of the genetic code, we investigated aminoacylation of synthetic RNA minihelices with a human enzyme to understand the sequences needed for that aminoacylation compared with those needed for a microbial system. We show here that the species-specific aminoacylation of glycine tRNAs is recapitulated by a species-specific aminoacylation of minihelices. Although the mammalian and Escherichia coli minihelices differ at 6 of 12 base pairs, two of the three nucleotides essential for aminoacylation by the E. coli enzyme are conserved in the mammalian minihelix. The two conserved nucleotides were shown to be also important for aminoacylation of the mammalian minihelix by the human enzyme. A simple interchange of the differing nucleotide enabled the human enzyme to now charge the bacterial substrate and not the mammalian minihelix. Conversely, this interchange made the bacterial enzyme specific for the mammalian substrate. Thus, the positional locations (if not the actual nucleotides) for the operational RNA code for glycine appear conserved from bacteria to mammals.
Abstract:
Chromosome I from the yeast Saccharomyces cerevisiae contains a DNA molecule of approximately 231 kbp and is the smallest naturally occurring functional eukaryotic nuclear chromosome so far characterized. The nucleotide sequence of this chromosome has been determined as part of an international collaboration to sequence the entire yeast genome. The chromosome contains 89 open reading frames and 4 tRNA genes. The central 165 kbp of the chromosome resembles other large sequenced regions of the yeast genome in both its high density and distribution of genes. In contrast, the remaining sequences flanking this DNA that comprise the two ends of the chromosome and make up more than 25% of the DNA molecule have a much lower gene density, are largely not transcribed, contain no genes essential for vegetative growth, and contain several apparent pseudogenes and a 15-kbp redundant sequence. These terminally repetitive regions consist of a telomeric repeat called W', flanked by DNA closely related to the yeast FLO1 gene. The low gene density, presence of pseudogenes, and lack of expression are consistent with the idea that these terminal regions represent the yeast equivalent of heterochromatin. The occurrence of such a high proportion of DNA with so little information suggests that its presence gives this chromosome the critical length required for proper function.
Abstract:
The 23S rRNA-targeted probes GAM42a and BET42a provided equivocal results with the uncultured gammaproteobacterium 'Candidatus Competibacter phosphatis' where some cells bound GAM42a and other cells bound BET42a in fluorescence in situ hybridization (FISH) experiments. Probes GAM42a and BET42a span positions 1027-1043 in the 23S rRNA and differ from each other by one nucleotide at position 1033. Clone libraries were prepared from PCR products spanning the 16S rRNA genes, intergenic spacer region and 23S rRNA genes from two mixed cultures enriched in 'Candidatus C. phosphatis'. With individual clone inserts, the 16S rDNA portion was used to confirm the source organism as 'Candidatus C. phosphatis' and the 23S rDNA portion was used to determine the sequence of the GAM42a/BET42a probe target region. Of the 19 clones sequenced, 8 had the GAM42a probe target (T at position 1033) and 11 had G at position 1033, the only mismatch with GAM42a. However, none of the clones had the BET42a probe target (A at 1033). Non-canonical base-pairing between the 23S rRNA of 'Candidatus C. phosphatis' with G at position 1033 and GAM42a (G-A) or BET42a (G-T) is likely to explain the probing anomalies. A probe (GAM42_C1033) was optimized for use in FISH, targeting cells with G at position 1033, and was found to highlight not only some 'Candidatus C. phosphatis' cells, but also other bacteria. This demonstrates that there are bacteria in addition to 'Candidatus C. phosphatis' with the GAM42_C1033 probe target and not the BET42a or GAM42a probe target.