929 results for protein sequence classification
Abstract:
Ocular cicatricial pemphigoid (OCP) is an autoimmune disease that affects mainly conjunctiva and other squamous epithelia. OCP is histologically characterized by a separation of the epithelium from underlying tissues within the basement membrane zone. Immunopathological studies demonstrate the deposition of anti-basement membrane zone autoantibodies in vivo. Purified IgG from sera of patients with active OCP identified a cDNA clone from a human keratinocyte cDNA library that had complete homology with the cytoplasmic domain of β4-integrin. The sera recognized a 205-kDa protein in human epidermal, human conjunctiva, and tumor cell lysates that was identified as β4-integrin by its reaction with polyclonal and monoclonal antibodies to human β4-integrin. Sera from patients with bullous pemphigoid, pemphigus vulgaris, and cicatricial pemphigoid-like diseases did not recognize the 205-kDa protein, indicating the specificity of the binding. These data strongly implicate a role for human β4-integrin in the pathogenesis of OCP. It should be emphasized that multiple antigens in the basement membrane zone of squamous epithelia may serve as targets for a wide spectrum of autoantibodies observed in vesiculobullous diseases. Molecular definition of these autoantigens will facilitate the classification and characterization of subsets of cicatricial pemphigoid and help distinguish them from bullous pemphigoid. This study highlights the function and importance of β4-integrin in maintaining the attachment of epithelial cells to the basement membrane.
Abstract:
IFNγ, once called the macrophage-activating factor, stimulates many genes in macrophages, ultimately leading to the elicitation of innate immunity. IFNγ's functions depend on the activation of STAT1, which stimulates transcription of IFNγ-inducible genes through the GAS element. The IFN consensus sequence binding protein (icsbp or IFN regulatory factor 8), encoding a transcription factor of the IFN regulatory factor family, is one such IFNγ-inducible gene in macrophages. We found that macrophages from ICSBP−/− mice were defective in inducing some IFNγ-responsive genes, even though they were capable of activating STAT1 in response to IFNγ. Accordingly, IFNγ activation of luciferase reporters fused to the GAS element was severely impaired in ICSBP−/− macrophages, but transfection of ICSBP resulted in marked stimulation of these reporters. Consistent with its role in activating IFNγ-responsive promoters, ICSBP stimulated reporter activity in a GAS-specific manner, even in the absence of IFNγ treatment, and in STAT1 negative cells. Indicative of a mechanism for this stimulation, DNA affinity binding assays revealed that endogenous ICSBP was recruited to a multiprotein complex that bound to GAS. These results suggest that ICSBP, when induced by IFNγ through STAT1, in turn generates a second wave of transcription from GAS-containing promoters, thereby contributing to the elicitation of IFNγ's unique activities in immune cells.
Abstract:
To test a different approach to understanding the relationship between the sequence of part of a protein and its conformation in the overall folded structure, the amino acid sequence corresponding to an α-helix of T4 lysozyme was duplicated in tandem. The presence of such a sequence repeat provides the protein with “choices” during folding. The mutant protein folds with almost wild-type stability, is active, and crystallizes in two different space groups, one isomorphous with wild type and the other with two molecules in the asymmetric unit. The fold of the mutant is essentially the same in all cases, showing that the inserted segment has a well-defined structure. More than half of the inserted residues are themselves helical and extend the helix present in the wild-type protein. Participation of additional duplicated residues in this helix would have required major disruption of the parent structure. The results clearly show that the residues within the duplicated sequence tend to maintain a helical conformation even though the packing interactions with the remainder of the protein are different from those of the original helix. This supports the hypothesis that the structures of individual α-helices are determined predominantly by the nature of the amino acids within the helix, rather than the structural environment provided by the rest of the protein.
Abstract:
In the last decade, two tools, one drawn from information theory and the other from artificial neural networks, have proven particularly useful in many different areas of sequence analysis. The work presented herein indicates that these two approaches can be joined in a general fashion to produce a very powerful search engine that is capable of locating members of a given nucleic acid sequence family in either local or global sequence searches. This program can, in turn, be queried for its definition of the motif under investigation, ranking each base in context for its contribution to membership in the motif family. In principle, the method used can be applied to any binding motif, including both DNA and RNA sequence families, given sufficient family size.
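As a rough illustration of the information-theory side of such an approach, the sketch below computes the per-position information content of a set of aligned DNA sites, using the standard Shannon-entropy measure (2 bits minus the observed entropy, without any small-sample correction). The abstract does not state which measure the authors used, so this choice, the function name `information_content`, and the toy sites are all assumptions made for illustration.

```python
import math
from collections import Counter

def information_content(aligned_sites):
    """Return bits of information per column for equal-length DNA sites.

    Assumes a 4-letter alphabet with uniform background, so the maximum
    is 2 bits per position. No small-sample correction is applied.
    """
    length = len(aligned_sites[0])
    result = []
    for i in range(length):
        counts = Counter(site[i] for site in aligned_sites)
        total = sum(counts.values())
        # Shannon entropy of the observed base frequencies in column i.
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        result.append(2.0 - entropy)
    return result

# Hypothetical toy alignment, not data from the study.
sites = ["TATA", "TATT", "TACA", "TAGA"]
for position, bits in enumerate(information_content(sites)):
    print(position, round(bits, 3))
```

Fully conserved columns score 2 bits and variable columns score less, which gives one simple way to rank positions by their contribution to a motif.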
Abstract:
An additivity-based sequence to reactivity algorithm for the interaction of members of the Kazal family of protein inhibitors with six selected serine proteinases is described. Ten consensus variable contact positions in the inhibitor were identified, and the 19 possible variants at each of these positions were expressed. The free energies of interaction of these variants and the wild type were measured. For an additive system, this data set allows for the calculation of all possible sequences, subject to some restrictions. The algorithm was extensively tested. It is exceptionally fast so that all possible sequences can be predicted. The strongest, the most specific possible, and the least specific inhibitors were designed, and an evolutionary problem was solved.
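The additivity assumption can be sketched in a few lines: the predicted free energy of a multiply substituted inhibitor is taken as the wild-type value plus the sum of the differences measured for each single-residue variant. The position labels, residues, and energy values below are hypothetical placeholders, not measurements from the study, and `predict_dg` is an illustrative name.

```python
# Minimal sketch of an additivity-based prediction.
# All numbers are made up; units are assumed to be kcal/mol.

WILD_TYPE = {"P1": "L", "P2": "T"}  # hypothetical residues at two contact positions
WILD_TYPE_DG = -10.0                # hypothetical wild-type free energy of association

# Hypothetical free-energy changes of single-residue variants relative to wild type.
DELTA = {
    ("P1", "A"): 1.5,
    ("P1", "K"): 3.0,
    ("P2", "S"): 0.5,
}

def predict_dg(sequence):
    """Add each single-variant contribution to the wild-type free energy."""
    total = WILD_TYPE_DG
    for position, residue in sequence.items():
        if residue != WILD_TYPE[position]:
            total += DELTA[(position, residue)]
    return total

print(predict_dg({"P1": "A", "P2": "S"}))  # -10.0 + 1.5 + 0.5
```

Because each prediction is a single pass over the variable positions, enumerating every combination of residues stays cheap, which is consistent with the speed the abstract describes.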
Abstract:
In order to support the structural genomic initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25 320 structural domains and a further 160 000 sequence relatives has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153–165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of the non-identical structures.
Abstract:
PDB-REPRDB is a database of representative protein chains from the Protein Data Bank (PDB). The previous version of PDB-REPRDB provided 48 representative sets, whose similarity criteria were predetermined, on the WWW. The current version is designed so that the user may obtain a quick selection of representative chains from PDB. The selection of representative chains can be dynamically configured according to the user’s requirement. The WWW interface provides a large degree of freedom in setting parameters, such as cut-off scores of sequence and structural similarity. One can obtain a representative list and classification data of protein chains from the system. The current database includes 20 457 protein chains from PDB entries (August 6, 2000). The system for PDB-REPRDB is available at the Parallel Protein Information Analysis system (PAPIA) WWW server (http://www.rwcp.or.jp/papia/).
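One common way to pick representatives under a user-supplied similarity cut-off is a greedy pass over chains ranked by preference. The sketch below illustrates that generic idea only; the abstract does not describe the algorithm PDB-REPRDB actually uses, and the `identity` function (a toy, alignment-free identity), the chain identifiers, and the sequences are all hypothetical.

```python
def identity(a, b):
    """Toy similarity: fraction of identical positions over the shorter length.

    A real system would use an alignment-based score; this is a stand-in.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n

def select_representatives(chains, cutoff):
    """Greedy selection over (chain_id, sequence) pairs ordered by preference.

    A chain is kept only if it is less similar than `cutoff` to every
    representative chosen so far.
    """
    reps = []
    for chain_id, seq in chains:
        if all(identity(seq, rep_seq) < cutoff for _, rep_seq in reps):
            reps.append((chain_id, seq))
    return [chain_id for chain_id, _ in reps]

# Hypothetical chains, listed best-first.
chains = [
    ("chain_1", "MKTAYIAK"),
    ("chain_2", "MKTAYIAR"),
    ("chain_3", "GSHMLEDP"),
]
print(select_representatives(chains, cutoff=0.8))
```

Raising the cut-off keeps more chains and lowering it keeps fewer, which mirrors the kind of user-adjustable parameter the abstract describes.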
Abstract:
TIGRFAMs is a collection of protein families featuring curated multiple sequence alignments, hidden Markov models and associated information designed to support the automated functional identification of proteins by sequence homology. We introduce the term ‘equivalog’ to describe members of a set of homologous proteins that are conserved with respect to function since their last common ancestor. Related proteins are grouped into equivalog families where possible, and otherwise into protein families with other hierarchically defined homology types. TIGRFAMs currently contains over 800 protein families, available for searching or downloading at www.tigr.org/TIGRFAMs. Classification by equivalog family, where achievable, complements classification by orthology, superfamily, domain or motif. It provides the information best suited for automatic assignment of specific functions to proteins from large-scale genome sequencing projects.
Abstract:
Macromolecular transport systems in bacteria currently are classified by function and sequence comparisons into five basic types. In this classification system, type II and type IV secretion systems both possess members of a superfamily of genes for putative NTP hydrolase (NTPase) proteins that are strikingly similar in structure, function, and sequence. These include VirB11, TrbB, TraG, GspE, PilB, PilT, and ComG1. The predicted protein product of tadA, a recently discovered gene required for tenacious adherence of Actinobacillus actinomycetemcomitans, also has significant sequence similarity to members of this superfamily and to several unclassified and uncharacterized gene products of both Archaea and Bacteria. To understand the relationship of tadA and tadA-like genes to those encoding the putative NTPases of type II/IV secretion, we used a phylogenetic approach to obtain a genealogy of 148 NTPase genes and reconstruct a scenario of gene superfamily evolution. In this phylogeny, clear distinctions can be made between type II and type IV families and their constituent subfamilies. In addition, the subgroup containing tadA constitutes a novel and extremely widespread subfamily of the family encompassing all putative NTPases of type IV secretion systems. We report diagnostic amino acid residue positions for each major monophyletic family and subfamily in the phylogenetic tree, and we propose an easy method for precisely classifying and naming putative NTPase genes based on phylogeny. This molecular key-based method can be applied to other gene superfamilies and represents a valuable tool for genome analysis.
Abstract:
The human prion gene contains five copies of a 24 nt repeat that is highly conserved among species. An analysis of folding free energies of the human prion mRNA, in particular in the repeat region, suggested biased codon selection and the presence of RNA patterns. In particular, pseudoknots, similar to the one predicted by Wills in the human prion mRNA, were identified in the repeat region of all prion mRNAs available in GenBank, but not those of birds and the red slider turtle. An alignment of these mRNAs, which share low sequence homology, shows several co-variations that maintain the pseudoknot pattern. The presence of pseudoknots in yeast Sup35p and Rnq1 suggests acquisition in the prokaryotic era. Computer generated three-dimensional structures of the human prion pseudoknot highlight protein and RNA interaction domains, which suggest a possible effect in prion protein translation. The role of pseudoknots in prion diseases is discussed as individuals with extra copies of the 24 nt repeat develop the familial form of Creutzfeldt–Jakob disease.
Abstract:
SF3b155 is an essential spliceosomal protein, highly conserved during evolution. It has been identified as a subunit of splicing factor SF3b, which, together with a second multimeric complex termed SF3a, interacts specifically with the 12S U2 snRNP and converts it into the active 17S form. The protein displays a characteristic intranuclear localization. It is diffusely distributed in the nucleoplasm but highly concentrated in defined intranuclear structures termed “speckles,” a subnuclear compartment enriched in small ribonucleoprotein particles and various splicing factors. The primary sequence of SF3b155 suggests a multidomain structure, different from those of other nuclear speckles components. To identify which part of SF3b155 determines its specific intranuclear localization, we have constructed expression vectors encoding a series of epitope-tagged SF3b155 deletion mutants as well as chimeric combinations of SF3b155 sequences with the soluble cytoplasmic protein pyruvate kinase. Following transfection of cultured mammalian cells, we have identified (i) a functional nuclear localization signal of the monopartite type (KRKRR, amino acids 196–200) and (ii) a molecular segment with multiple threonine-proline repeats (amino acids 208–513), which is essential and sufficient to confer a specific accumulation in nuclear speckles. This latter sequence element, in particular amino acids 208–440, is required for correct subcellular localization of SF3b155 and is also sufficient to target a reporter protein to nuclear speckles. Moreover, this “speckle-targeting sequence” transfers the capacity for interaction with other U2 snRNP components.
Abstract:
We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an all-vs.-all comparison of protein domains [taken here from the Structural Classification of Proteins (scop) database] and then fitting a simple distribution function to the observed scores. By using this distribution, we can attach a statistical significance to each comparison score in the form of a P value, the probability that a better score would occur by chance. As expected, we find that the scores for sequence matching follow an extreme-value distribution. The agreement, moreover, between the P values that we derive from this distribution and those reported by standard programs (e.g., blast and fasta) validates our approach. Structure comparison scores also follow an extreme-value distribution when the statistics are expressed in terms of a structural alignment score (essentially the sum of reciprocated distances between aligned atoms minus gap penalties). We find that the traditional metric of structural similarity, the rms deviation in atom positions after fitting aligned atoms, follows a different distribution of scores and does not perform as well as the structural alignment score. Comparison of the sequence and structure statistics for pairs of proteins known to be related distantly shows that structural comparison is able to detect approximately twice as many distant relationships as sequence comparison at the same error rate. The comparison also indicates that there are very few pairs with significant similarity in terms of sequence but not structure whereas many pairs have significant similarity in terms of structure but not sequence.
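To make the idea of a P value from an extreme-value distribution concrete, the sketch below fits a Gumbel distribution to a set of background scores and evaluates the upper-tail probability P(S ≥ x) = 1 − exp(−exp(−(x − μ)/β)). The method-of-moments fit is a simple stand-in chosen for illustration and is not necessarily the fitting procedure used in the study; the function names and the toy scores are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def fit_gumbel_moments(scores):
    """Estimate Gumbel location (mu) and scale (beta) by the method of moments."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    beta = sd * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta

def p_value(score, mu, beta):
    """Probability that a chance score is at least `score` under the fitted Gumbel."""
    z = (score - mu) / beta
    # 1 - exp(-exp(-z)), written with expm1 for accuracy when the result is tiny.
    return -math.expm1(-math.exp(-z))

# Hypothetical background scores from unrelated pairs.
background = [21.0, 25.0, 23.5, 30.0, 27.0, 24.0, 26.5, 22.0]
mu, beta = fit_gumbel_moments(background)
print(p_value(45.0, mu, beta))
```

A score far above the bulk of the background distribution receives a small P value, and the same function can be applied to sequence or structure scores once each has its own fitted parameters.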
Abstract:
In vitro selection of nucleic acid binding species (aptamers) is superficially similar to the immune response. Both processes produce biopolymers that can recognize targets with high affinity and specificity. While antibodies are known to recognize the sequence and conformation of protein surface features (epitopes), very little is known about the precise interactions between aptamers and their epitopes. Therefore, aptamers that could recognize a particular epitope, a peptide fragment of human immunodeficiency virus type I Rev, were selected from a random sequence RNA pool. Several of the selected RNAs could bind the free peptide more tightly than a natural RNA ligand, the Rev-binding element. In accord with the hypothesis that protein and nucleic acid binding cusps are functionally similar, interactions between aptamers and the peptide target could be disrupted by sequence substitutions. Moreover, the aptamers appeared to be able to bind peptides with different solution conformations, implying an induced fit mechanism for binding. Just as anti-peptide antibodies can sometimes recognize the corresponding epitope when presented in a protein, the anti-peptide aptamers were found to specifically bind to Rev.
Abstract:
Approximately 40% of diffuse large cell lymphomas are associated with chromosomal translocations that deregulate the expression of the BCL6 gene by juxtaposing heterologous promoters to the BCL-6 coding domain. The BCL6 gene encodes a 95-kDa protein containing six C-terminal zinc-finger motifs and an N-terminal POZ domain, suggesting that it may function as a transcription factor. By using a DNA sequence selected for its ability to bind recombinant BCL-6 in vitro, we show here that BCL-6 is present in DNA-binding complexes in nuclear extracts from various B-cell lines. In transient transfection experiments, BCL6 can repress transcription from promoters linked to its DNA target sequence and this activity is dependent upon specific DNA-binding and the presence of an intact N-terminal half of the protein. We demonstrate that this part of the BCL6 molecule contains an autonomous transrepressor domain and that two noncontiguous regions, including the POZ motif, mediate maximum transrepressive activity. These results indicate that the BCL-6 protein can function as a sequence-specific transcriptional repressor and have implications for the role of BCL6 in normal lymphoid development and lymphomagenesis.
Abstract:
A key event in Ras-mediated signal transduction and transformation involves Ras interaction with its downstream effector targets. Although substantial evidence has established that the Raf-1 serine/threonine kinase is a critical effector of Ras function, there is increasing evidence that Ras function is mediated through interaction with multiple effectors to trigger Raf-independent signaling pathways. In addition to the two Ras GTPase activating proteins (GAPs; p120- and NF1-GAP), other candidate effectors include activators of the Ras-related Ral proteins (RalGDS and RGL) and phosphatidylinositol 3-kinase. Interaction between Ras and its effectors requires an intact Ras effector domain and involves preferential recognition of active Ras-GTP. Surprisingly, these functionally diverse effectors lack significant sequence homology and no consensus Ras binding sequence has been described. We have now identified a consensus Ras binding sequence shared among a subset of Ras effectors. We have also shown that peptides containing this sequence from Raf-1 (RKTFLKLA) and NF1-GAP (RRFFLDIA) block NF1-GAP stimulation of Ras GTPase activity and Ras-mediated activation of mitogen-activated protein kinases. In summary, the identification of a consensus Ras-GTP binding sequence establishes a structural basis for the ability of diverse effector proteins to interact with Ras-GTP. Furthermore, our demonstration that peptides that contain Ras-GTP binding sequences can block Ras function provides a step toward the development of anti-Ras agents.