9 resultados para bacteria genome nucleotide usage
em University of Queensland eSpace - Australia
Resumo:
Enhanced biological phosphorus removal (EBPR) is one of the best-studied microbially mediated industrial processes because of its ecological and economic relevance. Despite this, it is not well understood at the metabolic level. Here we present a metagenomic analysis of two lab-scale EBPR sludges dominated by the uncultured bacterium, Candidatus Accumulibacter phosphatis.'' The analysis sheds light on several controversies in EBPR metabolic models and provides hypotheses explaining the dominance of A. phosphatis in this habitat, its lifestyle outside EBPR and probable cultivation requirements. Comparison of the same species from different EBPR sludges highlights recent evolutionary dynamics in the A. phosphatis genome that could be linked to mechanisms for environmental adaptation. In spite of an apparent lack of phylogenetic overlap in the flanking communities of the two sludges studied, common functional themes were found, at least one of them complementary to the inferred metabolism of the dominant organism. The present study provides a much needed blueprint for a systems-level understanding of EBPR and illustrates that metagenomics enables detailed, often novel, insights into even well-studied biological systems.
Resumo:
High-quality data about protein structures and their gene sequences are essential to the understanding of the relationship between protein folding and protein coding sequences. Firstly we constructed the EcoPDB database, which is a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we presented a novel approach based on information theory to investigate the correlation between cysteine synonymous codon usages and local amino acids flanking cysteines, the correlation between cysteine synonymous codon usages and synonymous codon usages of local amino acids flanking cysteines, as well as the correlation between cysteine synonymous codon usages and the disulfide bonding states of cysteines in the E. coli genome. The results indicate that the nearest neighboring residues and their synonymous codons of the C-terminus have the greatest influence on the usages of the synonymous codons of cysteines and the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation mechanism of protein structures at gene sequence level and reflect the biological function restriction that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines to introduce disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also be utilized as a complementary computational method and be applicable to analyse the synonymous codon usages in other model organisms. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
The 23S rRNA-targeted probes GAM42a and BET42a provided equivocal results with the uncultured gammaproteobacterium 'Candidatus Competibacter phosphatis' where some cells bound GAM42a and other cells bound BET42a in fluorescence in situ hybridization (FISH) experiments. Probes GAM42a and BET42a span positions 1027-1043 in the 23S rRNAand differ from each other by one nucleotide at position 1033. Clone libraries were prepared from PCR products spanning the 16S rRNA genes, intergenic spacer region and 23S rRNA genes from two mixed cultures enriched in 'Candidatus C. phosphatis'. With individual clone inserts, the 16S rDNA portion was used to confirm the source organism as 'Candidatus C. phosphatis' and the 23S rDNA portion was used to determine the sequence of the GAM42a/BET42a probe target region. Of the 19 clones sequenced, 8 had the GAM42a probe target (T at position 1033) and 11 had G at position 1033, the only mismatch with GAM42a. However, none of the clones had the BET42a probe target (A at 1033). Non-canonical base-pairing between the 23S rRNA of 'Candidatus C. phosphatis' with G at position 1033 and GAM42a (G-A) or BET42a (G-T) is likely to explain the probing anomalies. A probe (GAM42_C1033) was optimized for use in FISH, targeting cells with G at position 1033, and was found to highlight not only some 'Candidatus C. phosphatis' cells, but also other bacteria. This demonstrates that there are bacteria in addition to 'Candidatus C. phosphatis' with the GAM42_C1033 probe target and not the BET42a or GAM42a probe target.
Resumo:
Translational pausing may occur due to a number of mechanisms, including the presence of non-optimal codons, and it is thought to play a role in the folding of specific polypeptide domains during translation and in the facilitation of signal peptide recognition during see-dependent protein targeting. In this whole genome analysis of Escherichia coli we have found that non-optimal codons in the signal peptide-encoding sequences of secretory genes are overrepresented relative to the mature portions of these genes; this is in addition to their overrepresentation in the 5'-regions of genes encoding non-secretory proteins. We also find increased non-optimal codon usage at the 3' ends of most E. coli genes, in both non-secretory and secretory sequences. Whereas presumptive translational pausing at the 5' and 3' ends of E. coli messenger RNAs may clearly have a general role in translation, we suggest that it also has a specific role in sec-dependent protein export, possibly in facilitating signal peptide recognition. This finding may have important implications for our understanding of how the majority of non-cytoplasmic proteins are targeted, a process that is essential to all biological cells. (C) 2004 Elsevier Inc. All rights reserved.
Resumo:
Sequence diversity in the coat protein coding region of Australian strains of Johnsongrass mosaic virus (JGMV) was investigated. Field isolates were sampled during a seven year period from Johnsongrass, sorghum and corn across the northern grain growing region. The 23 isolates were found to have greater than 94% nucleotide and amino acid sequence identity. The Australian isolates and two strains from the U.S.A. had about 90% nucleotide sequence identity and were between 19 and 30% different in the N-terminus of the coat protein. Two amino acid residues were found in the core region of the coat protein in isolates obtained from sorghum having the Krish gene for JGMV resistance that differed from those found in isolates from other hosts which did not have this single dominant resistance gene. These amino acid changes may have been responsible for overcoming the resistance conferred by the Krish gene for JGMV resistance in sorghum. The identification of these variable regions was essential for the development of durable pathogen-derived resistance to JGMV in sorghum.
Resumo:
We completed the genome sequence of Lettuce necrotic yellows virus (LNYV) by determining the nucleotide sequences of the 4a (putative phosphoprotein), 4b, M (matrix protein), G (glycoprotein) and L (polymerase) genes. The genome consists of 12,807 nucleotides and encodes six genes in the order 3' leader-N-4a(P)-4b-M-G-L-5' trailer. Sequences were derived from clones of a cDNA library from LNYV genomic RNA and from fragments amplified using reverse transcription-polymerase chain reaction. The 4a protein has a low isoelectric point characteristic for rhabdovirus phosphoproteins. The 4b protein has significant sequence similarities with the movement proteins of capillo- and trichoviruses and may be involved in cell-to-cell movement. The putative G protein sequence contains a predicted 25 amino acids signal peptide and endopeptidase cleavage site, three predicted glycosylation sites and a putative transmembrane domain. The deduced L protein sequence shows similarities with the L proteins of other plant rhabdoviruses and contains polymerase module motifs characteristic for RNA-dependent RNA polymerases of negative-strand RNA viruses. Phylogenetic analysis of this motif among rhabdoviruses placed LNYV in a group with other sequenced cytorhabdoviruses, most closely related to Strawberry crinkle virus. (c) 2005 Elsevier B.V. All rights reserved.
Resumo:
Leptospirosis is one of the most common zoonotic diseases in the world, resulting in high morbidity and mortality in humans and affecting global livestock production. Most infections are caused by either Leptospira borgpetersenii or Leptospira interrogans, bacteria that vary in their distribution in nature and rely on different modes of transmission. We report the complete genomic sequences of two strains of L. borgpetersenii serovar Hardjo that have distinct phenotypes and virulence. These two strains have nearly identical genetic content, with subtle frameshift and point mutations being a common form of genetic variation. Starkly limited regions of synteny are shared between the large chromosomes of L. borgpetersenii and L. interrogans, probably the result of frequent recombination events between insertion sequences. The L. borgpetersenii genome is ≈700 kb smaller and has a lower coding density than L. interrogans, indicating it is decaying through a process of insertion sequence-mediated genome reduction. Loss of gene function is not random but is centered on impairment of environmental sensing and metabolite transport and utilization. These features distinguish L. borgpetersenii from L. interrogans, a species with minimal genetic decay and that survives extended passage in aquatic environments encountering a mammalian host. We conclude that L. borgpetersenii is evolving toward dependence on a strict host-to-host transmission cycle.
Resumo:
Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a welldefined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.
Resumo:
Full-length genome sequences of five virulent and five avirulent strains of Newcastle disease virus isolated between 1998 and 2002 in Victoria and New South Wales, Australia were determined. Comparisons between these strains revealed that coding sequence variability in the haemagglutinin-neuraminidase (HN), matrix (M) and phosphoprotein (P) gene sequences appeared to be more variable than in the fusion (F), nucleocapsid (N) and RNA dependent-RNA replicase (L) genes. Sequence analysis of a number of other isolates made during the recent virulent NDV outbreaks, also identified the presence of a number of variants with altered F gene cleavage sites, which resulted in altered biological properties of those viruses. Quasispecies analysis of a number of field isolates indicated the presence of virulent virus in one particular isolate. Gene sequence analysis of the progenitor virus isolated in 1998 showed very little sequence variation when compared to that of a progenitor-like virus isolated in 2001 demonstrating that in the field. viral genome sequence variation appears to be biologically restricted to that of a consensus sequence. (c) 2005 Elsevier B.V. All rights reserved.