968 results for Protein Sequence Analysis


Relevance:

90.00%

Publisher:

Abstract:

Sequence analysis of chloroplast and mitochondrial large subunit rRNA genes from over 75 green algae disclosed 28 new group I intron-encoded proteins carrying a single LAGLIDADG motif. These putative homing endonucleases form four subfamilies of homologous enzymes, with the members of each subfamily being encoded by introns sharing the same insertion site. We showed that four divergent endonucleases from the I-CreI subfamily cleave the same DNA substrates. Mapping of the 66 amino acids that are conserved among the members of this subfamily on the 3-dimensional structure of I-CreI bound to its recognition sequence revealed that these residues participate in protein folding, homodimerization, DNA recognition and catalysis. Surprisingly, only seven of the 21 I-CreI amino acids interacting with DNA are conserved, suggesting that I-CreI and its homologs use different subsets of residues to recognize the same DNA sequence. Our sequence comparison of all 45 single-LAGLIDADG proteins identified so far suggests that these proteins share related structures and that there is a weak pressure in each subfamily to maintain identical protein–DNA contacts. The high sequence variability we observed in the DNA-binding site of homologous LAGLIDADG endonucleases provides insight into how these proteins evolve new DNA specificity.
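The conservation mapping described in this abstract starts from identifying alignment positions shared by all subfamily members. A minimal sketch of that step follows, assuming a pre-computed multiple alignment given as equal-length strings with `-` for gaps. The function name `conserved_columns` and the toy sequences are illustrative inventions, not data or code from the study.

```python
def conserved_columns(alignment):
    """Return indices of columns where every sequence has the same non-gap residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        # A column is conserved when it holds exactly one residue and no gap.
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Toy alignment (invented for illustration; not real I-CreI subfamily sequences).
toy_alignment = [
    "MKLAGLIDADGS",
    "MRLAGLIDGDGT",
    "MK-AGFIDADGS",
]
print(conserved_columns(toy_alignment))
```

Strict identity is the simplest criterion; a real analysis might instead score columns by similarity groups, which this sketch does not attempt.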

Relevance:

90.00%

Publisher:

Abstract:

In order to support the structural genomic initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25 320 structural domains and a further 160 000 sequence relatives, has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153–165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of the non-identical structures.

Relevance:

90.00%

Publisher:

Abstract:

While genome sequencing projects are advancing rapidly, EST sequencing and analysis remains a primary research tool for the identification and categorization of gene sequences in a wide variety of species and an important resource for annotation of genomic sequence. The TIGR Gene Indices (http://www.tigr.org/tdb/tgi.shtml) are a collection of species-specific databases that use a highly refined protocol to analyze EST sequences in an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed by first clustering, then assembling EST and annotated gene sequences from GenBank for the targeted species. This process produces a set of unique, high-fidelity virtual transcripts, or Tentative Consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, to provide links between orthologous and paralogous genes and as a resource for comparative sequence analysis.
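The "cluster first, then assemble" idea can be illustrated with a toy clustering step. The sketch below groups sequences that share a minimum number of k-mers, using a union-find structure. The abstract does not describe the actual TIGR clustering criteria, so the shared k-mer rule, the parameters `k` and `min_shared`, and all names and sequences here are assumptions made only for illustration. The assembly step is not shown.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_by_shared_kmers(seqs, k=8, min_shared=3):
    """Group sequences (dict name -> sequence) that share at least min_shared k-mers."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kmer_sets = {name: kmers(s, k) for name, s in seqs.items()}
    for a, b in combinations(seqs, 2):
        if len(kmer_sets[a] & kmer_sets[b]) >= min_shared:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Invented toy reads: est1 and est2 overlap by 13 bases, est3 is unrelated.
toy_ests = {
    "est1": "ACGTACGTTAGCCGATAGGCT",
    "est2": "TAGCCGATAGGCTTTACGGA",
    "est3": "GGGGCCCCAAAATTTTGGCC",
}
print(cluster_by_shared_kmers(toy_ests))
```

Clustering is transitive here: two sequences land in the same group if a chain of overlapping sequences connects them, which is what lets a later assembly step build one consensus per cluster.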

Relevance:

90.00%

Publisher:

Abstract:

The iProClass database is an integrated resource that provides comprehensive family relationships and structural and functional features of proteins, with rich links to various databases. It is extended from ProClass, a protein family database that integrates PIR superfamilies and PROSITE motifs. The iProClass currently consists of more than 200 000 non-redundant PIR and SWISS-PROT proteins organized with more than 28 000 superfamilies, 2600 domains, 1300 motifs, 280 post-translational modification sites and links to more than 30 databases of protein families, structures, functions, genes, genomes, literature and taxonomy. Protein and family summary reports provide rich annotations, including membership information with length, taxonomy and keyword statistics, full family relationships, comprehensive enzyme and PDB cross-references and graphical feature display. The database facilitates classification-driven annotation for protein sequence databases and complete genomes, and supports structural and functional genomic research. The iProClass is implemented in the Oracle 8i object-relational system and is available for sequence search and report retrieval at http://pir.georgetown.edu/iproclass/.

Relevance:

90.00%

Publisher:

Abstract:

The K homology (KH) module is a widespread RNA-binding motif that has been detected by sequence similarity searches in such proteins as heterogeneous nuclear ribonucleoprotein K (hnRNP K) and ribosomal protein S3. Analysis of spatial structures of KH domains in hnRNP K and S3 reveals that they are topologically dissimilar and thus belong to different protein folds. Thus KH motif proteins provide a rare example of protein domains that share significant sequence similarity in the motif regions but possess globally distinct structures. The two distinct topologies might have arisen from an ancestral KH motif protein by N- and C-terminal extensions, or one of the existing topologies may have evolved from the other by extension, displacement and deletion. C-terminal extension (deletion) requires β-sheet rearrangement through the insertion (removal) of a β-strand in a manner similar to that observed in the serine protease inhibitors (serpins). The current analysis offers a new look at how proteins can change fold in the course of evolution.

Relevance:

90.00%

Publisher:

Abstract:

The multispanning membrane protein Ste6, a member of the ABC-transporter family, is transported to the yeast vacuole for degradation. To identify functions involved in the intracellular trafficking of polytopic membrane proteins, we looked for functions that block Ste6 transport to the vacuole upon overproduction. In our screen, we identified several known vacuolar protein sorting (VPS) genes (SNF7/VPS32, VPS4, and VPS35) and a previously uncharacterized open reading frame, which we named MOS10 (more of Ste6). Sequence analysis showed that Mos10 is a member of a small family of coiled-coil–forming proteins, which includes Snf7 and Vps20. Deletion mutants of all three genes stabilize Ste6 and show a “class E vps phenotype.” Maturation of the vacuolar hydrolase carboxypeptidase Y was affected in the mutants and the endocytic tracer FM4-64 and Ste6 accumulated in a dot or ring-like structure next to the vacuole. Differential centrifugation experiments demonstrated that about half of the hydrophilic proteins Mos10 and Vps20 was membrane associated. The intracellular distribution was further analyzed for Mos10. On sucrose gradients, membrane-associated Mos10 cofractionated with the endosomal t-SNARE Pep12, pointing to an endosomal localization of Mos10. The growth phenotypes of the mutants suggest that the “Snf7-family” members are involved in a cargo-specific event.

Relevance:

90.00%

Publisher:

Abstract:

Efficient motility of the eukaryotic flagellum requires precise temporal and spatial control of its constituent dynein motors. The central pair and its associated structures have been implicated as important members of a signal transduction cascade that ultimately regulates dynein arm activity. To identify central pair components involved in this process, we characterized a Chlamydomonas motility mutant (pf6-2) obtained by insertional mutagenesis. pf6-2 flagella twitch ineffectively and lack the 1a projection on the C1 microtubule of the central pair. Transformation with constructs containing a full-length, wild-type copy of the PF6 gene rescues the functional, structural, and biochemical defects associated with the pf6 mutation. Sequence analysis indicates that the PF6 gene encodes a large polypeptide that contains numerous alanine-rich, proline-rich, and basic domains and has limited homology to an expressed sequence tag derived from a human testis cDNA library. Biochemical analysis of an epitope-tagged PF6 construct demonstrates that the PF6 polypeptide is an axonemal component that cosediments at 12.6S with several other polypeptides. The PF6 protein appears to be an essential component required for assembly of some of these polypeptides into the C1-1a projection.

Relevance:

90.00%

Publisher:

Abstract:

The intracellular pathogen Trypanosoma cruzi is the etiological agent of Chagas’ disease. We have isolated a full-length cDNA encoding uracil-DNA glycosylase (UDGase), a key enzyme involved in DNA repair, from this organism. The deduced protein sequence is highly conserved at the C-terminus of the molecule and shares key residues involved in binding or catalysis with most of the UDGases described so far, while the N-terminal part is highly variable. The gene is single copy and is located on a chromosome of ∼1.9 Mb. A His-tagged recombinant protein was overexpressed, purified and used to raise polyclonal antibodies. Western blot analysis revealed the existence of a single UDGase species in parasite extracts. Using a specific ethidium bromide fluorescence assay, recombinant T.cruzi UDGase was shown to specifically excise uracil from DNA. The addition of both Leishmania major AP endonuclease and exonuclease III, the major AP endonuclease from Escherichia coli, produces stimulation of UDGase activity. This activation is specific for AP endonuclease and suggests functional communication between the two enzymes.

Relevance:

90.00%

Publisher:

Abstract:

Detection of similarity is particularly difficult for small proteins and thus connections between many of them remain unnoticed. Structure and sequence analysis of several metal-binding proteins reveals unexpected similarities in structural domains classified as different protein folds in SCOP and suggests unification of seven folds that belong to two protein classes. The common motif, termed treble clef finger in this study, forms the protein structural core and is 25–45 residues long. The treble clef motif is assembled around the central zinc ion and consists of a zinc knuckle, loop, β-hairpin and an α-helix. The knuckle and the first turn of the helix each incorporate two zinc ligands. Treble clef domains constitute the core of many structures such as ribosomal proteins L24E and S14, RING fingers, protein kinase cysteine-rich domains, nuclear receptor-like fingers, LIM domains, phosphatidylinositol-3-phosphate-binding domains and His-Me finger endonucleases. The treble clef finger is a uniquely versatile motif adaptable for various functions. This small domain with a 25 residue structural core can accommodate eight different metal-binding sites and can have many types of functions from binding of nucleic acids, proteins and small molecules, to catalysis of phosphodiester bond hydrolysis. Treble clef motifs are frequently incorporated in larger structures or occur in doublets. Present analysis suggests that the treble clef motif defines a distinct structural fold found in proteins with diverse functional properties and forms one of the major zinc finger groups.

Relevance:

90.00%

Publisher:

Abstract:

We have cloned, expressed and purified a hexameric human DNA helicase (hHcsA) from HeLa cells. Sequence analysis demonstrated that the hHcsA has strong sequence homology with DNA helicase genes from Saccharomyces cerevisiae and Caenorhabditis elegans, indicating that this gene appears to be well conserved from yeast to human. The hHcsA gene was cloned and expressed in Escherichia coli and purified to homogeneity. The expressed protein had a subunit molecular mass of 116 kDa and analysis of its native molecular mass by size exclusion chromatography suggested that hHcsA is a hexameric protein. The hHcsA protein had a strong DNA-dependent ATPase activity that was stimulated ≥5-fold by single-stranded DNA (ssDNA). Human hHcsA unwinds duplex DNA and analysis of the polarity of translocation demonstrated that the polarity of DNA unwinding was in a 5′→3′ direction. The helicase activity was stimulated by human and yeast replication protein A, but not significantly by E.coli ssDNA-binding protein. We have analyzed expression levels of the hHcsA gene in HeLa cells during various phases of the cell cycle using in situ hybridization analysis. Our results indicated that the expression of the hHcsA gene, as evidenced from the mRNA levels, is cell cycle-dependent. The maximal level of hHcsA expression was observed in late G1/early S phase, suggesting a possible role for this protein during S phase and in DNA synthesis.

Relevance:

90.00%

Publisher:

Abstract:

Fusicoccin (FC) is a fungal toxin that activates the plant plasma membrane H+-ATPase by binding with 14-3-3 proteins, causing membrane hyperpolarization. Here we report on the effect of FC on a gene-for-gene pathogen-resistance response and show that FC application induces the expression of several genes involved in plant responses to pathogens. Ten members of the FC-binding 14-3-3 protein gene family were isolated from tomato (Lycopersicon esculentum) to characterize their role in defense responses. Sequence analysis is suggestive of common biochemical functions for these tomato 14-3-3 proteins, but their genes showed different expression patterns in leaves after challenges. Different specific subsets of 14-3-3 genes were induced after treatment with FC and during a gene-for-gene resistance response. Possible roles for the H+-ATPase and 14-3-3 proteins in responses to pathogens are discussed.

Relevance:

90.00%

Publisher:

Abstract:

The accumulation of the disaccharide trehalose in anhydrobiotic organisms allows them to survive severe environmental stress. A plant cDNA, SlTPS1, encoding a 109-kD protein, was isolated from the resurrection plant Selaginella lepidophylla, which accumulates high levels of trehalose. Protein-sequence comparison showed that SlTPS1 shares high similarity to trehalose-6-phosphate synthase genes from prokaryotes and eukaryotes. SlTPS1 mRNA was constitutively expressed in S. lepidophylla. DNA gel-blot analysis indicated that SlTPS1 is present as a single-copy gene. Transformation of a Saccharomyces cerevisiae tps1Δ mutant disrupted in the ScTPS1 gene with S. lepidophylla SlTPS1 restored growth on fermentable sugars and the synthesis of trehalose at high levels. Moreover, the SlTPS1 gene introduced into the tps1Δ mutant was able to complement both deficiencies: sensitivity to sublethal heat treatment at 39°C and induced thermotolerance at 50°C. The osmosensitive phenotype of the yeast tps1Δ mutant grown in NaCl and sorbitol was also restored by the SlTPS1 gene. Thus, SlTPS1 protein is a functional plant homolog capable of sustaining trehalose biosynthesis and could play a major role in stress tolerance in S. lepidophylla.

Relevance:

90.00%

Publisher:

Abstract:

SINE (short interspersed element) insertion analysis elucidates contentious aspects in the phylogeny of toothed whales and dolphins (Odontoceti), especially river dolphins. Here, we characterize 25 informative SINEs inserted into unique genomic loci during evolution of odontocetes to construct a cladogram, and determine a total of 2.8 kb per taxon of the flanking sequences of these SINE loci to estimate divergence times among lineages. We demonstrate that: (i) Odontocetes are monophyletic; (ii) Ganges River dolphins, beaked whales, and ocean dolphins diverged (in this order) after sperm whales; (iii) three other river dolphin taxa, namely the Amazon, La Plata, and Yangtze river dolphins, form a monophyletic group with Yangtze River dolphins being the most basal; and (iv) the rapid radiation of extant cetacean lineages occurred some 28–33 million years B.P., in strong accord with the fossil record. The combination of SINE and flanking sequence analysis suggests a topology and set of divergence times for odontocete relationships, offering alternative explanations for several long-standing problems in cetacean evolution.

Relevance:

90.00%

Publisher:

Abstract:

Species of pathogenic microbes are composed of an array of evolutionarily distinct chromosomal genotypes characterized by diversity in gene content and sequence (allelic variation). The occurrence of substantial genetic diversity has hindered progress in developing a comprehensive understanding of the molecular basis of virulence and new therapeutics such as vaccines. To provide new information that bears on these issues, 11 genes encoding extracellular proteins in the human bacterial pathogen group A Streptococcus identified by analysis of four genomes were studied. Eight of the 11 genes encode proteins with a LPXTG(L) motif that covalently links Gram-positive virulence factors to the bacterial cell surface. Sequence analysis of the 11 genes in 37 geographically and phylogenetically diverse group A Streptococcus strains cultured from patients with different infection types found that recent horizontal gene transfer has contributed substantially to chromosomal diversity. Regions of the inferred proteins likely to interact with the host were identified by molecular population genetic analysis, and Western immunoblot analysis with sera from infected patients confirmed that they were antigenic. Real-time reverse transcriptase–PCR (TaqMan) assays found that transcription of six of the 11 genes was substantially up-regulated in the stationary phase. In addition, transcription of many genes was influenced by the covR and mga trans-acting gene regulatory loci. Multilocus investigation of putative virulence genes by the integrated approach described herein provides an important strategy to aid microbial pathogenesis research and rapidly identify new targets for therapeutics research.
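Screening inferred proteins for the LPXTG motif mentioned in this abstract can be sketched as a simple pattern search. The example below assumes that X stands for any single standard amino acid and does not require the optional trailing L. The function name `find_lpxtg` and the test sequence are hypothetical and are not taken from the study.

```python
import re

# L-P-(any one of the 20 standard amino acids)-T-G; the optional trailing L is not required.
LPXTG_PATTERN = re.compile(r"LP[ACDEFGHIKLMNPQRSTVWY]TG")


def find_lpxtg(protein_seq):
    """Return (0-based start, matched text) for each LPXTG motif in the sequence."""
    return [(m.start(), m.group()) for m in LPXTG_PATTERN.finditer(protein_seq.upper())]


# Invented toy sequence containing two motif instances.
toy_protein = "MKKLLPETGEENSLPATGLAAA"
print(find_lpxtg(toy_protein))
```

A pattern hit alone does not establish cell-wall anchoring; in practice such a search would be one filter among several, for example alongside checks on motif position near the C-terminus.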

Relevance:

90.00%

Publisher:

Abstract:

Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide.