8 results for Orfs
at Indian Institute of Science - Bangalore - India
Abstract:
Background: The Mycobacterium leprae genome has less than 50% coding capacity and 1,133 pseudogenes. Preliminary evidence suggests that some pseudogenes are expressed. Therefore, defining pseudogene transcriptional and translational potentials of this genome should increase our understanding of their impact on M. leprae physiology. Results: Gene expression analysis identified transcripts from 49% of all M. leprae genes including 57% of all ORFs and 43% of all pseudogenes in the genome. Transcribed pseudogenes were randomly distributed throughout the chromosome. Factors resulting in pseudogene transcription included: 1) co-orientation of transcribed pseudogenes with transcribed ORFs within or exclusive of operon-like structures; 2) the paucity of intrinsic stem-loop transcriptional terminators between transcribed ORFs and downstream pseudogenes; and 3) predicted pseudogene promoters. Mechanisms for translational "silencing" of pseudogene transcripts included the lack of both translational start codons and strong Shine-Dalgarno (SD) sequences. Transcribed pseudogenes also contained multiple in-frame stop codons and high Ka/Ks ratios, compared with homologs in M. tuberculosis and ORFs in M. leprae. A pseudogene transcript containing an active promoter, a strong SD site, and a start codon, but also two in-frame stop codons, yielded a protein product when expressed in E. coli. Conclusion: Approximately half of M. leprae's transcriptome consists of inactive gene products consuming energy and resources without potential benefit to M. leprae. Presently it is unclear what additional detrimental effect(s) this large number of inactive mRNAs has on the functional capability of this organism. Translation of these pseudogenes may play an important role in overall energy consumption and resultant pathophysiological characteristics of M. leprae. However, this study also demonstrated that multiple translational "silencing" mechanisms are present, reducing additional energy and resource expenditure required for protein production from the vast majority of these transcripts.
Abstract:
Genome sequence information has generated increasing evidence for the claim that repetitive DNA sequences present within and around genes could play an important role in the regulation of gene expression. Polypurine/polypyrimidine sequences [poly(Pu/Py)] have been observed in the vicinity of promoters and within the transcribed regions of many genes. To understand whether such sequences influence the level of gene expression, we constructed several prokaryotic and eukaryotic expression vectors incorporating poly(Pu/Py) repeats both within and upstream of a reporter gene, lacZ (encoding β-galactosidase), and studied its expression in vivo. We find that, in contrast to the situation in Escherichia coli, the presence of poly(Pu/Py) sequences within the gene does not significantly inhibit gene expression in mammalian cells. On the other hand, the presence of such sequences upstream of lacZ leads to a several-fold reduction of gene expression in mammalian cells. Similar down-regulation was observed when a structural cassette containing poly(Pu/Py) sequences upstream of lacZ was integrated into yeast chromosome V. Sequence analysis of the nine totally sequenced yeast chromosomes shows that a large number of such sequences occur upstream of ORFs. On the basis of our experimental results and DNA sequence analysis, we propose that these sequences can function as cis-acting transcriptional regulators.
Abstract:
The complete genome of the baker's yeast S. cerevisiae was analyzed for the presence of polypurine/polypyrimidine (poly[Pu/Py]) repeats, and their occurrences were classified on the basis of their location within and outside open reading frames (ORFs). The analysis reveals that such sequence motifs are abundant in both coding and noncoding regions. Clear positional preferences are seen when these tracts occur in noncoding regions. These motifs appear to occur predominantly at a unit nucleosomal length both upstream and downstream of ORFs. Moreover, there is a biased distribution of polypurines in the coding strands when these motifs occur within open reading frames. The significance of the biased distribution is discussed with reference to the occurrence of these motifs in other known mRNA sequences and expressed sequence tags. A model for cis regulation of gene expression is proposed based on the ability of these motifs to form an intermolecular triple helix structure when present within the coding region and/or to modulate nucleosome positioning via enhanced histone affinity when present outside coding regions.
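The kind of scan described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the function name `find_pu_py_tracts` and the `min_len` threshold are assumptions chosen for illustration, since the abstract does not state how a tract was defined or what minimum length was used.

```python
import re

def find_pu_py_tracts(seq, min_len=15):
    """Return (start, end, kind) for each uninterrupted run of purines (A/G)
    or pyrimidines (C/T) of at least min_len bases.

    min_len is a hypothetical threshold, not a value taken from the abstract.
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    tracts = []
    for match in pattern.finditer(seq):
        kind = "polypurine" if match.group()[0] in "AG" else "polypyrimidine"
        tracts.append((match.start(), match.end(), kind))
    return tracts

# Toy sequence: a 10-base purine run followed by a 10-base pyrimidine run.
example = "ACGT" + "AGAGAGAGAG" + "CTCTCTCTCT" + "ACAC"
tracts = find_pu_py_tracts(example, min_len=8)
```

Classifying each tract as inside or outside an ORF, as the abstract describes, would additionally require ORF coordinates from a genome annotation, which this sketch does not attempt.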
Abstract:
Background: HU, a small, basic, histone-like protein, is a major component of the bacterial nucleoid. E. coli has two subunits of HU coded by the hupA and hupB genes, whereas Mycobacterium tuberculosis (Mtb) has only one subunit of HU, coded by ORF Rv2986c (hupB gene). One noticeable feature of Mtb HupB, based on sequence alignment of HU orthologs from different bacteria, was that HupB(Mtb) bears a highly basic extension at its C-terminal end, and this prompted an examination of its role in Mtb HupB function. Methodology/Principal Findings: With this objective, two clones of Mtb HupB were generated: one expressing the full-length HupB protein (HupB(Mtb)) and another expressing only the N-terminal region (first 95 amino acids) of HupB (HupB(MtbN)). Gel retardation assays revealed that HupB(MtbN) is almost like E. coli HU (heat-stable nucleoid protein) in terms of its DNA binding, with a binding constant (Kd) for linear dsDNA greater than 1000 nM, a value comparable to that obtained for the HUαα and HUαβ forms. However, the CTR (C-terminal region) of HupB(Mtb) imparts greater specificity in DNA binding. HupB(Mtb) protein binds more strongly to supercoiled plasmid DNA than to linear DNA, and this binding is very stable, as it provides DNase I protection even up to 5 minutes. Similar results were obtained when the abilities of both proteins to mediate protection against DNA strand cleavage by hydroxyl radicals generated by the Fenton reaction were compared. It was also observed that both proteins have a DNA binding preference for A:T-rich DNA, which may occur at the regulatory regions of ORFs and the oriC region of Mtb. Conclusions/Significance: These data thus suggest that HupB(Mtb) may participate in chromosome organization in vivo and may also play a passive, possibly architectural, role.
Abstract:
A cDNA clone isolated by differentially screening a cytokinin-induced haustorial cDNA library of Cuscuta reflexa was sequenced and identified as the gene coding for cytochrome b5, based on the similarity of the deduced amino-acid sequence with that of the cauliflower (60% identity) and tobacco (78% identity) proteins. The 5'-UTR is unusually long (720 bp) and contains 14 potential start codons (ATG) and 10 short ORFs.
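Counts like those reported for this 5'-UTR can be obtained with a scan along the following lines. This is a minimal sketch under stated assumptions: the helper names `count_start_codons` and `find_short_orfs` are hypothetical, and a short ORF is assumed here to be an ATG followed by an in-frame stop codon within the UTR, which may differ from the criterion the authors used.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def count_start_codons(utr):
    """Count ATG triplets in the sequence, in any reading frame."""
    return utr.upper().count("ATG")

def find_short_orfs(utr):
    """Return (start, end) for every ATG that reaches an in-frame stop codon
    before the sequence ends. Coordinates are 0-based, end-exclusive, and the
    stop codon is included in the reported span.

    Assumption: this ATG-to-stop definition is illustrative only.
    """
    utr = utr.upper()
    orfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3))
                break
    return orfs

# Toy UTR with two ATGs, only the first of which reaches an in-frame stop.
toy_utr = "CCATGAAATAGGGATGCCC"
n_atg = count_start_codons(toy_utr)
short_orfs = find_short_orfs(toy_utr)
```

An ATG without a downstream in-frame stop inside the UTR is not reported, which is one way the number of start codons can exceed the number of short ORFs.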
Abstract:
Of the ~4000 ORFs identified through the genome sequence of Mycobacterium tuberculosis (TB) H37Rv, experimentally determined structures are available for 312. Since knowledge of protein structures is essential to obtain a high-resolution understanding of the underlying biology, we seek to obtain a structural annotation for the genome, using computational methods. Structural models were obtained and validated for ~2877 ORFs, covering ~70% of the genome. Functional annotation of each protein was based on fold-based functional assignments and a novel binding-site-based ligand association. New algorithms for binding site detection and genome-scale binding site comparison at the structural level, recently reported from the laboratory, were utilized. Besides these, the annotation covers detection of various sequence and sub-structural motifs and quaternary structure predictions based on the corresponding templates. The study provides an opportunity to obtain a global perspective of the fold distribution in the genome. The annotation indicates that cellular metabolism can be achieved with only 219 folds. New insights about the folds that predominate in the genome, as well as the fold combinations that make up multi-domain proteins, are also obtained. 1728 binding pockets have been associated with ligands through binding site identification and sub-structure similarity analyses. The resource (http://proline.physics.iisc.ernet.in/Tbstructuralannotation), being one of the first to be based on structure-derived functional annotations at a genome scale, is expected to be useful for better understanding of TB and for application in drug discovery. The reported annotation pipeline is fairly generic and can be applied to other genomes as well.
Abstract:
We report the draft genome sequence of methicillin-resistant Staphylococcus aureus (MRSA) strain ST672, an emerging disease clone in India, from a septicemia patient. The genome size is about 2.82 Mb with 2,485 open reading frames (ORFs). The staphylococcal cassette chromosome mec (SCCmec) element (type V) and immune evasion cluster appear to be different from those of strain ST772 on preliminary examination.
Abstract:
Escherichia coli-mycobacterium shuttle vectors are important tools for gene expression and gene replacement in mycobacteria. However, most of the currently available vectors are limited in their use because of the lack of extended multiple cloning sites (MCSs) and convenience of appending an epitope tag(s) to the cloned open reading frames (ORFs). Here we report a new series of vectors that allow for the constitutive and regulatable expression of proteins, appended with peptide tag sequences at their N and C termini, respectively. The applicability of these vectors is demonstrated by the constitutive and induced expression of the Mycobacterium tuberculosis pknK gene, coding for protein kinase K, a serine-threonine protein kinase. Furthermore, a suicide plasmid with expanded MCS for creating gene replacements, a plasmid for chromosomal integrations at the commonly used L5 attB site, and a hypoxia-responsive vector, for expression of a gene(s) under hypoxic conditions that mimic latency, have also been created. Additionally, we have created a vector for the coexpression of two proteins controlled by two independent promoters, with each protein being in fusion with a different tag. The shuttle vectors developed in the present study are excellent tools for the analysis of gene function in mycobacteria and are a valuable addition to the existing repertoire of vectors for mycobacterial research.