33 results for Orfs
Abstract:
A DNA sequence has been obtained for a 35.6-kb genomic segment from Heliobacillus mobilis that contains a major cluster of photosynthesis genes. A total of 30 ORFs were identified, 20 of which encode enzymes for bacteriochlorophyll and carotenoid biosynthesis, reaction-center (RC) apoprotein, and cytochromes for cyclic electron transport. Donor side electron-transfer components to the RC include a putative RC-associated cytochrome c553 and a unique four-large-subunit cytochrome bc complex consisting of Rieske Fe-S protein (encoded by petC), cytochrome b6 (petB), subunit IV (petD), and a diheme cytochrome c (petX). Phylogenetic analysis of various photosynthesis gene products indicates a consistent grouping of oxygenic lineages that are distinct and descendent from anoxygenic lineages. In addition, H. mobilis was placed as the closest relative to cyanobacteria, which form a monophyletic origin to chloroplast-based photosynthetic lineages. The consensus of the photosynthesis gene trees also indicates that purple bacteria are the earliest emerging photosynthetic lineage. Our analysis also indicates that an ancient gene-duplication event giving rise to the paralogous bchI and bchD genes predates the divergence of all photosynthetic groups. In addition, our analysis of gene duplication of the photosystem I and photosystem II core polypeptides supports a “heterologous fusion model” for the origin and evolution of oxygenic photosynthesis.
Abstract:
We report here that wild-type Escherichia coli can grow on the chitin disaccharide, N,N′-diacetylchitobiose (GlcNAc)2, as the sole source of carbon. Transposon mutants were isolated that were unable to ferment (GlcNAc)2 but grew normally on the monosaccharide GlcNAc. One such mutant was used to screen a wild-type E. coli genomic cosmid library for restoration of (GlcNAc)2 fermentation. A partial sequence analysis of the isolated fragment mapped the clone to the (previously sequenced) E. coli genome between 39.0 and 39.2 min. The nucleotide ORFs at this region had been previously assigned to code for a “cryptic” cellobiose utilization (cel) operon. We report here, however, that functional analysis of the operon, including growth and chemotaxis, reveals that it encodes a set of proteins that are not cryptic, but are induced by (GlcNAc)2 and catabolize the disaccharide. We therefore propose to rename the cel operon as the chb (N,N′-diacetylchitobiose) operon, with the letter designation of the genes of the operon to be reassigned consistent with the nomenclature based on functional characterization of the gene products as follows: celA to chbB, celB to chbC, celC to chbA, celD to chbR, and celF to chbF. Furthermore, sequencing evidence indicates that the operon contains an additional gene of unknown function to be designated as chbG. Thus, the overall gene sequence is to be named chbBCARFG.
Abstract:
Genes for σ-like factors of bacterial-type RNA polymerase have not been characterized from any multicellular eukaryotes, although they probably play a crucial role in the expression of plastid photosynthesis genes. We have cloned three distinct cDNAs, designated SIG1, SIG2, and SIG3, for polypeptides possessing amino acid sequences for domains conserved in σ70 factors of bacterial RNA polymerases from the higher plant Arabidopsis thaliana. Each gene is present as one copy per haploid genome without any additional sequences hybridized in the genome. Transient expression assays using green fluorescent protein demonstrated that N-terminal regions of the SIG2 and SIG3 ORFs could function as transit peptides for import into chloroplasts. Transcripts for all three SIG genes were detected in leaves but not in roots, and were induced in leaves of dark-adapted plants in rapid response to light illumination. Together with results of our previous analysis of tissue-specific regulation of transcription of plastid photosynthesis genes, these results indicate that expressed levels of the genes may influence transcription by regulating RNA polymerase activity in a green tissue-specific manner.
Abstract:
cagA, a gene that codes for an immunodominant antigen, is present only in Helicobacter pylori strains that are associated with severe forms of gastroduodenal disease (type I strains). We found that the genetic locus that contains cagA (cag) is part of a 40-kb DNA insertion that likely was acquired horizontally and integrated into the chromosomal glutamate racemase gene. This pathogenicity island is flanked by direct repeats of 31 bp. In some strains, cag is split into a right segment (cagI) and a left segment (cagII) by a novel insertion sequence (IS605). In a minority of H. pylori strains, cagI and cagII are separated by an intervening chromosomal sequence. Nucleotide sequencing of the 23,508 base pairs that form the cagI region and the extreme 3′ end of the cagII region reveals the presence of 19 ORFs that code for proteins predicted to be mostly membrane associated with one gene (cagE), which is similar to the toxin-secretion gene of Bordetella pertussis, ptlC, and the transport systems required for plasmid transfer, including the virB4 gene of Agrobacterium tumefaciens. Transposon inactivation of several of the cagI genes abolishes induction of IL-8 expression in gastric epithelial cell lines. Thus, we believe the cag region may encode a novel H. pylori secretion system for the export of virulence determinants.
Abstract:
The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor.
Abstract:
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
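The supervised idea in the abstract above, learning from genes of known function and then labeling uncharacterized ones from their expression profiles, can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it uses a plain linear soft-margin SVM trained by batch subgradient descent rather than the similarity (kernel) functions the abstract mentions, and the expression values, labels, function names, and hyperparameters are all invented for the example.

```python
# Toy linear soft-margin SVM trained by batch subgradient descent.
# All data and parameter values below are hypothetical.

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2*|w|^2 + mean hinge loss; return weights w and bias b."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 (member of the functional class) or -1 (non-member)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical log-ratio expression profiles over three conditions.
X = [[2.0, 1.8, 0.1], [1.9, 2.1, -0.2], [2.2, 1.7, 0.0],       # known members
     [-1.8, -2.0, 0.1], [-2.1, -1.9, 0.2], [-2.0, -2.2, -0.1]]  # known non-members
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
uncharacterized = [1.5, 1.6, 0.0]  # an ORF of unknown function
print(predict(w, b, uncharacterized))
```

A real analysis would use many more genes and conditions, a kernel chosen by cross-validation, and held-out data to estimate error rates.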
Abstract:
Poxviruses employ many strategies to evade and neutralize the host immune response. In this study, we have identified two vaccinia virus ORFs, termed A46R and A52R, that share amino acid sequence similarity with the Toll/IL-1 receptor (TIR) domain, a motif that defines the IL-1/Toll-like receptor (TLR) superfamily of receptors, which have a key role in innate immunity and inflammation. When expressed in mammalian cells, the protein products of both ORFs were shown to interfere specifically with IL-1 signal transduction. A46R partially inhibited IL-1-mediated activation of the transcription factor NFκB, and A52R potently blocked both IL-1- and TLR4-mediated NFκB activation. MyD88 is a TIR domain-containing adapter molecule known to have a central role in both IL-1 and TLR4 signaling. A52R mimicked the dominant-negative effect of a truncated version of MyD88 on IL-1, TLR4, and IL-18 signaling but had no effect on MyD88-independent signaling pathways. Therefore, A46R and A52R are likely to represent a mechanism used by vaccinia virus for suppressing TIR domain-dependent intracellular signaling.
Abstract:
Using the representation difference analysis technique, we have identified a novel gene, Ian4, which is preferentially expressed in hematopoietic precursor 32D cells transfected with wild-type versus mutant forms of the Bcr/Abl oncogene. Ian4 expression was undetectable in 32D cells transfected with v-src, oncogenic Ha-ras or v-Abl. Murine Ian4 maps to chromosome 6, 25 cM from the centromere. The Ian4 mRNA contains two open reading frames (ORFs) separated by 5 nt. The first ORF has the potential to encode a polypeptide of 67 amino acids without apparent homology to known proteins. The second ORF encodes a protein of 301 amino acids with a GTP/ATP-binding site in the N-terminus and a hydrophobic domain in the extreme C-terminus. The IAN-4 protein resides in the mitochondrial outer membrane and the last 20 amino acids are necessary for this localization. The IAN-4 protein has GTP-binding activity and shares sequence homology with a novel family of putative GTP-binding proteins: the immuno-associated nucleotide (IAN) family.
Abstract:
VIDA is a new virus database that organizes open reading frames (ORFs) from partial and complete genomic sequences from animal viruses. Currently VIDA includes all sequences from GenBank for Herpesviridae, Coronaviridae and Arteriviridae. The ORFs are organized into homologous protein families, which are identified on the basis of sequence similarity relationships. Conserved sequence regions of potential functional importance are identified and can be retrieved as sequence alignments. We use a controlled taxonomical and functional classification for all the proteins and protein families in the database. When available, protein structures that are related to the families have also been included. The database is available for online search and sequence information retrieval at http://www.biochem.ucl.ac.uk/bsm/virus_database/VIDA.html.
Abstract:
Apicomplexan parasites such as Toxoplasma gondii contain a primitive plastid, the apicoplast, whose genome consists of a 35-kb circular DNA related to the plastid DNA of plants. Plants synthesize fatty acids in their plastids. The first committed step in fatty acid synthesis is catalyzed by acetyl-CoA carboxylase (ACC). This enzyme is encoded in the nucleus, synthesized in the cytosol, and transported into the plastid. In the present work, two genes encoding ACC from T. gondii were cloned and the gene structure was determined. Both ORFs encode multidomain proteins, each with an N-terminal extension, compared with the cytosolic ACCs from plants. The N-terminal extension of one isozyme, ACC1, was shown to target green fluorescent protein to the apicoplast of T. gondii. In addition, the apicoplast contains a biotinylated protein, consistent with the assertion that ACC1 is localized there. The second ACC in T. gondii appears to be cytosolic. T. gondii mitochondria also contain a biotinylated protein, probably pyruvate carboxylase. These results confirm the essential nature of the apicoplast and explain the inhibition of parasite growth in cultured cells by herbicides targeting ACC.
Abstract:
We have undertaken an extensive screen to identify Saccharomyces cerevisiae genes whose products are involved in cell cycle progression. We report the identification of 113 genes, including 19 hypothetical ORFs, which confer arrest or delay in specific compartments of the cell cycle when overexpressed. The collection of genes identified by this screen overlaps with those identified in loss-of-function cdc screens but also includes genes whose products have not previously been implicated in cell cycle control. Through analysis of strains lacking these hypothetical ORFs, we have identified a variety of new CDC and checkpoint genes.
Abstract:
We describe a method to screen pools of DNA from multiple transposon lines for insertions in many genes simultaneously. We use thermal asymmetric interlaced–PCR, a hemispecific PCR amplification protocol that combines nested, insertion-specific primers with degenerate primers, to amplify DNA flanking the transposons. In reconstruction experiments with previously characterized Arabidopsis lines carrying insertions of the maize Dissociation (Ds) transposon, we show that fluorescently labeled, transposon-flanking fragments overlapping ORFs hybridize to cognate expressed sequence tags (ESTs) on a DNA microarray. We further show that insertions can be detected in DNA pools from as many as 100 plants representing different transposon lines and that all of the tested, transposon-disrupted genes whose flanking fragments can be amplified individually also can be detected when amplified from the pool. The ability of a transposon-flanking fragment to hybridize declines rapidly with decreasing homology to the spotted DNA fragment, so that only ESTs with >90% homology to the transposon-disrupted gene exhibit significant cross-hybridization. Because thermal asymmetric interlaced–PCR fragments tend to be short, use of the present method favors recovery of insertions in and near genes. We apply the technique to screening pools of new Ds lines using cDNA microarrays containing ESTs for ≈1,000 stress-induced and -repressed Arabidopsis genes.
Abstract:
Streptomyces lavendulae produces complestatin, a cyclic peptide natural product that antagonizes pharmacologically relevant protein–protein interactions including formation of the C4b,2b complex in the complement cascade and gp120-CD4 binding in the HIV life cycle. Complestatin, a member of the vancomycin group of natural products, consists of an α-ketoacyl hexapeptide backbone modified by oxidative phenolic couplings and halogenations. The entire complestatin biosynthetic and regulatory gene cluster spanning ca. 50 kb was cloned and sequenced. It consisted of 16 ORFs, encoding proteins homologous to nonribosomal peptide synthetases, cytochrome P450-related oxidases, ferredoxins, nonheme halogenases, four enzymes involved in 4-hydroxyphenylglycine (Hpg) biosynthesis, transcriptional regulators, and ABC transporters. The nonribosomal peptide synthetase consisted of a priming module, six extending modules, and a terminal thioesterase; their arrangement and domain content were entirely consistent with functions required for the biosynthesis of a heptapeptide or α-ketoacyl hexapeptide backbone. Two oxidase genes were proposed to be responsible for the construction of the unique aryl-ether-aryl-aryl linkage on the linear heptapeptide intermediate. Hpg, 3,5-dichloro-Hpg, and 3,5-dichloro-hydroxybenzoylformate are unusual building blocks that represent five of the seven requisite monomers in the complestatin peptide. Heterologous expression and biochemical analysis of 4-hydroxyphenylglycine transaminase confirmed its role as an aminotransferase responsible for formation of all three precursors. The close similarity but functional divergence between complestatin and chloroeremomycin biosynthetic genes also presents a unique opportunity for the construction of hybrid vancomycin-type antibiotics.
Abstract:
The complete nucleotide sequence, 5178 bp, of the totivirus Helminthosporium victoriae 190S virus (Hv190SV) double-stranded RNA, was determined. Computer-assisted sequence analysis revealed the presence of two large overlapping ORFs; the 5'-proximal large ORF (ORF1) codes for the coat protein (CP) with a predicted molecular mass of 81 kDa, and the 3'-proximal ORF (ORF2), which is in the -1 frame relative to ORF1, codes for an RNA-dependent RNA polymerase (RDRP). Unlike many other totiviruses, the overlap region between ORF1 and ORF2 lacks known structural information required for translational frameshifting. Using an antiserum to a C-terminal fragment of the RDRP, the product of ORF2 was identified as a minor virion-associated polypeptide of estimated molecular mass of 92 kDa. No CP-RDRP fusion protein with calculated molecular mass of 165 kDa was detected. The predicted start codon of the RDRP ORF (2605-AUG-2607) overlaps with the stop codon (2606-UGA-2608) of the CP ORF, suggesting RDRP is expressed by an internal initiation mechanism. Hv190SV is associated with a debilitating disease of its phytopathogenic fungal host. Knowledge of its genome organization and expression will be valuable for understanding its role in pathogenesis and for potential exploitation in the development of biocontrol measures.
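The coordinate arithmetic behind the overlapping start and stop codons reported above (an AUG at 2605-2607 sharing two bases with a UGA at 2606-2608, so the four bases read AUGA) can be checked with a short sketch. The function names, the frame-sign convention, and the toy sequence are assumptions made for illustration; only the four coordinates come from the abstract.

```python
def relative_frame(orf1_codon_start, orf2_codon_start):
    """Frame of ORF2 relative to ORF1 as -1, 0, or +1.

    Both arguments are 1-based positions of the first base of any codon
    in the respective ORF. The sign convention (ORF2 starting one base
    upstream counts as -1) is an assumption of this sketch.
    """
    shift = (orf2_codon_start - orf1_codon_start) % 3
    return {0: 0, 1: 1, 2: -1}[shift]

def find_start_stop_overlaps(rna):
    """1-based positions of every AUGA, where AUG overlaps a UGA stop."""
    return [i + 1 for i in range(len(rna) - 3) if rna[i:i + 4] == "AUGA"]

# Coordinates from the abstract: CP stop codon begins at 2606,
# RDRP start codon begins at 2605.
print(relative_frame(2606, 2605))

# Hypothetical toy sequence, not the Hv190SV genome.
print(find_start_stop_overlaps("GGCAUGACC"))
```

With the abstract's coordinates the RDRP start codon begins one base upstream of the CP stop codon, which is the same -1 relationship the abstract gives for ORF2 relative to ORF1.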
Abstract:
The antimycobacterial compound ethambutol [Emb; dextro-2,2'-(ethylenediimino)-di-1-butanol] is used to treat tuberculosis as well as disseminated infections caused by Mycobacterium avium. The critical target for Emb lies in the pathway for the biosynthesis of cell wall arabinogalactan, but the molecular mechanisms for drug action and resistance are unknown. The cellular target for Emb was sought using drug resistance, via target overexpression by a plasmid vector, as a selection tool. This strategy led to the cloning of the M. avium emb region which rendered the otherwise susceptible Mycobacterium smegmatis host resistant to Emb. This region contains three complete open reading frames (ORFs), embR, embA, and embB. The translationally coupled embA and embB genes are necessary and sufficient for an Emb-resistant phenotype which depends on gene copy number, and their putative novel membrane proteins are homologous to each other. The predicted protein encoded by embR, which is related to known transcriptional activators from Streptomyces, is expendable for the phenotypic expression of Emb resistance, but an intact divergent promoter region between embR and embAB is required. An Emb-sensitive cell-free assay for arabinan biosynthesis shows that overexpression of embAB is associated with high-level Emb-resistant arabinosyl transferase activity, and that embR appears to modulate the in vitro level of this activity. These data suggest that embAB encode the drug target of Emb, the arabinosyl transferase responsible for the polymerization of arabinose into the arabinan of arabinogalactan, and that overproduction of this Emb-sensitive target leads to Emb resistance.