911 results for Sequence Motif
Abstract:
Yeast soluble proteins were fractionated by calmodulin-agarose affinity chromatography and the Ca2+/calmodulin-binding proteins were analyzed by SDS-PAGE. One prominent protein of 66 kDa was excised from the gel and digested with trypsin, and the masses of the resulting fragments were determined by MALDI/MS. Twenty-one of the 38 monoisotopic peptide masses obtained after tryptic digestion were matched to the heat shock protein Ssb1/Hsp75, covering 37% of its sequence. Computational analysis of the primary structure of Ssb1/Hsp75 identified a unique potential amphipathic alpha-helix in its N-terminal ATPase domain with features of target regions for Ca2+/calmodulin binding. This region, which shares 89% similarity with the experimentally determined calmodulin-binding domain of mouse Hsc70, is conserved in nearly half of the 113 members of the HSP70 family investigated, from yeast to plants and animals. Based on the sequence of this region, phylogenetic analysis grouped the HSP70s into three distinct branches. Two of them comprise the non-calmodulin-binding Hsp70s: BiP/GRP78, a subfamily of eukaryotic HSP70s localized in the endoplasmic reticulum, and DnaK, a subfamily of prokaryotic HSP70s. A third, heterogeneous group is formed by eukaryotic cytosolic HSP70s containing the new calmodulin-binding motif and other cytosolic HSP70s whose sequences do not conform to the conserved motif, indicating that not all eukaryotic cytosolic Hsp70s are targets for calmodulin regulation. Furthermore, the calmodulin-binding domain found in eukaryotic HSP70s is also the target for binding of Bag-1, an enhancer of the ADP/ATP exchange activity of Hsp70s. A model in which calmodulin displaces Bag-1 and modulates Ssb1/Hsp75 chaperone activity is discussed.
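The peptide-matching step described above relies on predicting which peptides trypsin produces from a candidate protein. A minimal sketch of that in silico digestion follows, assuming the commonly used rule that trypsin cleaves after K or R unless the next residue is P, with no missed cleavages. The function names and the toy sequence are hypothetical and are not taken from the study; real peptide mass fingerprinting would also compute monoisotopic masses and match them within a tolerance.

```python
def tryptic_digest(seq: str) -> list[str]:
    """Split a one-letter protein sequence after K or R, except before P.

    Assumption: complete digestion with no missed cleavages.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Trailing peptide that does not end in K or R
        peptides.append(seq[start:])
    return peptides


def sequence_coverage(seq: str, matched_peptides: list[str]) -> float:
    """Fraction of residues covered by matched, non-overlapping peptides."""
    return sum(len(p) for p in matched_peptides) / len(seq)


# Hypothetical toy sequence, not Ssb1/Hsp75
toy = "AKRPGKT"
print(tryptic_digest(toy))
print(sequence_coverage(toy, ["RPGK"]))
```

The coverage function simply sums peptide lengths, which is valid only because peptides from a complete digest cannot overlap.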
Abstract:
Understanding the machinery of gene regulation to control gene expression has been one of the main focuses of bioinformaticians for years. We use a multi-objective genetic algorithm to evolve a specialized version of side effect machines for degenerate motif discovery. We compare some suggested objectives for the motifs they find, test different multi-objective scoring schemes and probabilistic models for the background sequence models and report our results on a synthetic dataset and some biological benchmarking suites. We conclude with a comparison of our algorithm with some widely used motif discovery algorithms in the literature and suggest future directions for research in this area.
Abstract:
Most RNA molecules must fold into a complex tertiary structure in order to carry out their biological functions. However, the determinants of a polynucleotide chain that are required for its folding and for its interactions with other elements are essentially unknown. Establishing structure-function relationships in large RNA molecules inevitably requires analyzing each element of their structure, both individually and in the context of other elements. Like a building, an RNA structure is composed of repetitive units assembled in a specific way. Recurrent RNA motifs are arrangements of nucleotides that are found at different locations in a tertiary structure and have identical or very similar conformations. Thus, one of the steps needed to understand the structure and function of RNA molecules is to systematically identify recurrent motifs and to analyze them comparatively in order to establish their consensus sequence. Analysis of all cases of double-helix packing in the ribosome structure led to the identification of a new arrangement named the along-groove packing motif (AGPM). This motif is found at 14 locations in the ribosome structure, as well as between the 23S ribosomal RNA and the transfer RNA molecules bound to the ribosomal P and E sites. The motif is formed by the packing of two double helices via their minor grooves. The sugar-phosphate backbone of one helix runs along the minor groove of the other helix, and vice versa. In each helix, the contact region comprises four base pairs. The tightest packing is found at the center of the arrangement, where a GU base pair in one helix often interacts with a Watson-Crick (WC) base pair in the other helix.
Although the presence of the central GU versus WC base pairs at the center of the motif increases its stability, other alternatives exist for different representatives of the motif. Comparative analysis of three combinatorial AGPM gene libraries, in which the central base pairs were varied completely at random, showed that the structural context influences the extent of variability of the nucleotide sequences forming the central base pairs. The fact that the identity of the central base pairs can vary suggested the presence of other determinants responsible for maintaining the integrity of the motif. Analysis of all contacts between the helices revealed that, outside the center of the motif, interactions between the sugar-phosphate backbones occur via three ribose-ribose contacts. For each of these contacts, the riboses of the interacting nucleotides must adopt particular positions to avoid colliding. We show that the position of these riboses is modulated by specific conformations of the base pairs to which they belong. Finally, another recurrent motif, identified within the structure of three AGPM cases, was named the "adenosine-wedge". Its analysis revealed that it is itself composed of another arrangement, named the NAG-triangle motif. We show that the "adenosine-wedge" motif is a complex RNA arrangement composed of four repetitive elements, namely the AGPM, "hook-turn", "A-minor" and NAG-triangle motifs. This clearly illustrates the hierarchical arrangement of RNA structures, which can also be observed for other RNA motifs. From a more global point of view, my results enrich our general understanding of the role of the different types of tertiary interactions in the formation of complex RNA molecules.
Abstract:
The recently described cupin superfamily of proteins includes the germin and germin-like proteins, of which the cereal oxalate oxidase is the best characterized. This superfamily also includes seed storage proteins, in addition to several microbial enzymes and proteins with unknown function. All these proteins are characterized by the conservation of two central motifs, usually containing two or three histidine residues presumed to be involved in metal binding in the catalytic active site. The present study on the coding regions of Synechocystis PCC6803 identifies a previously unknown group of 12 related cupins, each containing the characteristic two-motif signature. This group comprises 11 single-domain proteins, ranging in length from 104 to 289 residues, and includes two phosphomannose isomerases and two epimerases involved in cell wall synthesis, a member of the pirin group of nuclear proteins, a possible transcriptional regulator, and a close relative of a cytochrome c551 from Rhodococcus. Additionally, there is a duplicated, two-domain protein that has close similarity to an oxalate decarboxylase from the fungus Collybia velutipes and that is a putative progenitor of the storage proteins of land plants.
Abstract:
Self-assembly in aqueous solution has been investigated for two Fmoc [Fmoc = N-(fluorenyl)-9-methoxycarbonyl] tetrapeptides comprising the RGDS cell adhesion motif from fibronectin or the scrambled sequence GRDS. The hydrophobic Fmoc unit confers amphiphilicity on the molecules, and introduces aromatic stacking interactions. Circular dichroism and FTIR spectroscopy show that the self-assembly of both peptides at low concentration is dominated by interactions among Fmoc units, although Fmoc-GRDS shows β-sheet features at lower concentration than Fmoc-RGDS. Fibre X-ray diffraction indicates β-sheet formation by both peptides at sufficiently high concentration. Strong alignment effects are revealed by linear dichroism experiments for Fmoc-GRDS. Cryo-TEM and small-angle X-ray scattering (SAXS) reveal that both samples form fibrils with a diameter of approximately 10 nm. Both Fmoc-tetrapeptides form self-supporting hydrogels at sufficiently high concentration. Dynamic shear rheometry enabled measurements of the moduli for the Fmoc-GRDS hydrogel; however, syneresis was observed for the Fmoc-RGDS hydrogel, which was significantly less stable to shear. Molecular dynamics computer simulations were carried out considering parallel and antiparallel β-sheet configurations of systems containing 7 and 21 molecules of Fmoc-RGDS or Fmoc-GRDS, the results being analyzed in terms of both intermolecular structural parameters and energy contributions.
Abstract:
Snakebites are a major neglected tropical disease responsible for as many as 95,000 deaths every year worldwide. Viper venom serine proteases disrupt haemostasis of prey and victims by affecting various stages of the blood coagulation system. A better understanding of their sequence, structure, function and phylogenetic relationships will improve the knowledge of the pathological conditions and aid in the development of novel therapeutics for treating snakebites. A large dataset of all available viper venom serine proteases was developed and analysed to study various features of these enzymes. Despite the large number of venom serine protease sequences available, only a small proportion of these have been functionally characterised. Although they share some common features, such as a C-terminal extension, the GWG motif and disulphide linkages, they vary widely in features such as isoelectric points, potential N-glycosylation sites and functional characteristics. Some of the serine proteases contain substitutions for one or more of the critical residues in the catalytic triad or primary specificity pockets. Phylogenetic analysis clustered all the sequences into three major groups. The sequences with substitutions in the catalytic triad or specificity pocket clustered together in separate groups. Our study provides the most complete information on viper venom serine proteases to date and improves the current knowledge of the sequence, structure, function and phylogenetic relationships of these enzymes. This collective analysis of venom serine proteases will help in understanding the complexity of envenomation and potential therapeutic avenues.
Abstract:
CLEC-2 is a member of a new family of C-type lectin receptors characterized by a cytosolic YXXL downstream of three acidic amino acids in a sequence known as a hemITAM (hemi-immunoreceptor tyrosine-based activation motif). Dimerization of two phosphorylated CLEC-2 molecules leads to recruitment of the tyrosine kinase Syk via its tandem SH2 domains and initiation of a downstream signaling cascade. Using Syk-deficient and Zap-70-deficient cell lines, we show that hemITAM signaling is restricted to Syk and that the upstream triacidic amino acid sequence is required for signaling. Using surface plasmon resonance and phosphorylation studies, we demonstrate that the triacidic amino acids are required for phosphorylation of the YXXL. These results further emphasize the distinct nature of the proximal events in signaling by hemITAM relative to ITAM receptors.
Abstract:
The C-type lectin-like receptor CLEC-2 signals via phosphorylation of a single cytoplasmic YXXL sequence known as a hemi-immunoreceptor tyrosine-based activation motif (hemITAM). In this study, we show that phosphorylation of CLEC-2 by the snake toxin rhodocytin is abolished in the absence of the tyrosine kinase Syk but is not altered in the absence of the major platelet Src family kinases, Fyn, Lyn, and Src, or the tyrosine phosphatase CD148, which regulates the basal activity of Src family kinases. Further, phosphorylation of CLEC-2 by rhodocytin is not altered in the presence of the Src family kinase inhibitor PP2, even though PLCγ2 phosphorylation and platelet activation are abolished. A similar dependence of phosphorylation of CLEC-2 on Syk is also seen in response to stimulation by an IgG mAb to CLEC-2, although interestingly CLEC-2 phosphorylation is also reduced in the absence of Lyn. These results provide the first definitive evidence that Syk mediates phosphorylation of the CLEC-2 hemITAM receptor with Src family kinases playing a critical role further downstream through the regulation of Syk and other effector proteins, providing a new paradigm in signaling by YXXL-containing receptors.
Abstract:
Musca domestica larvae display, in anterior and middle midgut contents, a proteolytic activity with a pH optimum of 3.0-3.5 and kinetic properties like those of cathepsin D. Three cDNAs coding for preprocathepsin D-like proteinases (ppCAD 1, ppCAD 2, ppCAD 3) were cloned from an M. domestica midgut cDNA library. The coded protein sequences included the signal peptide, propeptide and mature enzyme that has all conserved catalytic and substrate binding residues found in bovine lysosomal cathepsin D. Nevertheless, ppCAD 2 and ppCAD 3 lack the characteristic proline loop and glycosylation sites. A comparison among the sequences of cathepsin D-like enzymes from some vertebrates and those found in M. domestica and in the genomes of Aedes aegypti, Drosophila melanogaster, Tribolium castaneum, and Bombyx mori showed that only flies have enzymes lacking the proline loop (as defined by the motif DxPxPx(G/A)P), thus resembling vertebrate pepsin. ppCAD 3 should correspond to the digestive cathepsin D-like proteinase (CAD) found in enzyme assays because: (1) it seems to be the most expressed CAD, based on the frequency of ESTs found; (2) the mRNA for CAD 3 is expressed only in the anterior and proximal middle midgut; (3) recombinant procathepsin D-like proteinase (pCAD 3), after auto-activation, has a pH optimum of 2.5-3.0 that is close to the luminal pH of M. domestica midgut; (4) immunoblots of proteins from different tissues revealed with anti-pCAD 3 serum were positive only in samples of anterior and middle midgut tissue and contents; and (5) CAD 3 is localized with immunogold inside secretory vesicles and around microvilli in anterior and middle midgut cells. The data support the view that, on adapting to deal with a bacteria-rich food in an acid midgut region, M. domestica digestive CAD resulted from the same archetypical gene as the intracellular cathepsin D, paralleling what happened with vertebrates. 
The lack of the proline loop may be somehow associated with the extracellular role of both pepsin and digestive CAD 3. (C) 2009 Elsevier Ltd. All rights reserved.
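The proline-loop motif quoted in the abstract above, DxPxPx(G/A)P, maps directly onto a regular expression. The following is a minimal sketch of that mapping, assuming x stands for any single residue and sequences are given in one-letter code. The function name and both toy sequences are hypothetical and are not real cathepsin D sequences.

```python
import re

# DxPxPx(G/A)P: x is read as "any residue", (G/A) as "G or A"
PROLINE_LOOP = re.compile(r"D.P.P.[GA]P")


def has_proline_loop(seq: str) -> bool:
    """Return True if the one-letter protein sequence contains the motif."""
    return PROLINE_LOOP.search(seq.upper()) is not None


# Hypothetical toy sequences for illustration only
with_loop = "MKTDAPKPSGPLLV"
without_loop = "MKTDAGKSSGALLV"

for name, seq in [("with_loop", with_loop), ("without_loop", without_loop)]:
    print(name, has_proline_loop(seq))
```

A scan like this only tests for the literal pattern; it says nothing about whether the matched residues actually form a loop in the folded enzyme.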
Abstract:
Glycine-rich proteins (GRPs) serve a variety of biological functions. Acanthoscurrin is an antimicrobial GRP isolated from hemocytes of the Brazilian spider Acanthoscurria gomesiana. Aiming to contribute to the knowledge of the secondary structure and stepwise solid-phase synthesis of GRPs' glycine-rich domains, we attempted to prepare G(101)GGLGGGRGGGYG(113)GGGGYGGGYG(123)GGY(126)GGGKYK(132)-NH(2), the acanthoscurrin C-terminal amidated fragment. Although a theoretical prediction did not indicate high aggregation potential for this peptide, repetitive incomplete aminoacylations were observed after incorporating Tyr(126) into the growing peptide-MBHA resin (Boc chemistry) at 60 degrees C. The problem was not solved by varying the coupling reagents or solvents, adding chaotropic salts to the reaction media or changing the resin/chemistry (Rink amide resin/Fmoc chemistry). Some improvement was made when CLEAR amide resin (Fmoc chemistry) was used, as it allowed for obtaining fragment G(113)-K(132). NIR-FT-Raman spectra collected for samples of the growing peptide-MBHA, -Rink amide resin and -CLEAR amide resin revealed the presence of beta-sheet structures. Only the combination of CLEAR-amide resin, 60 degrees C, Fmoc-(Fmoc-Hmb)Gly-OH and LiCl (the last two used alternately) was able to inhibit the phenomenon, as proven by NIR-FT-Raman analysis of the growing peptide-resin, allowing the total synthesis of the desired fragment Gly(101)-K(132). In summary, this work describes a new difficult sequence, contributes to understanding stepwise solid-phase synthesis of this type of peptide and shows that, at least while protected and linked to a resin, this GRP glycine-rich motif presents an early tendency to assume beta-sheet structures. (c) 2008 Wiley Periodicals, Inc. Biopolymers (Pept Sci) 92: 65-75, 2009.
Abstract:
Triatoma infestans (Hemiptera: Reduviidae) is a hematophagous insect that transmits the protozoan parasite Trypanosoma cruzi, the etiological agent of Chagas' disease. Its saliva contains trialysin, a protein that forms pores in membranes. Peptides based on the N-terminus of trialysin lyse cells and fold into alpha-helical amphipathic segments resembling antimicrobial peptides. Using a specific antiserum against trialysin, we show here that trialysin is synthesized as a precursor that is less active than the protein released after saliva secretion. A synthetic peptide flanked by a fluorophore and a quencher, including the acidic proregion and the lytic N-terminus of the protein, is also less active against cells and liposomes, increasing in activity upon proteolysis. Activation changes the peptide conformation, as observed by fluorescence increase and CD spectroscopy. This mechanism of activation could provide a way to impair the toxic effects of trialysin inside the salivary glands, thus restricting damaging lytic activity to the bite site.
Abstract:
The diversity of the V3 loop tip motif sequences of HIV-1 subtype B was analyzed in patients from Botucatu (Brazil) and Montpellier (France). Overall, 37 tetrameric tip motifs were identified, 28 and 17 of them being recognized in Brazilian and French patients, respectively. The GPGR (P) motif was predominant in French but not in Brazilian patients (53.5% vs 31.0%), whereas the GWGR (W) motif was frequent in Brazilian patients (23.0%) and absent in French patients. Three tip motif groups were considered: P, W, and non-P non-W groups. The distribution of HIV-1 isolates into the three groups was significantly different between isolates from Botucatu and from Montpellier (P < 0.001). A higher proportion of CXCR4-using HIV-1 (X4 variants) was observed in the non-P non-W group as compared with the P group (37.5% vs 19.1%), and no X4 variant was identified in the W group (P < 0.001). The higher proportion of X4 variants in the non-P non-W group was essentially observed among the patients from Montpellier, who have been infected with HIV-1 for a longer period of time than those from Botucatu. Among patients from Montpellier, CD4+ cell counts were lower in patients belonging to the non-P non-W group than in those belonging to the P group (24 cells/µL vs 197 cells/µL; P = 0.005). Taken together, the results suggest that variability of the V3 loop tip motif may be related to HIV-1 coreceptor usage and to disease progression. However, as analyzed by a bioinformatic method, the substitution of the V3 loop tip motif of the subtype B consensus sequence with the different tip motifs identified in the present study was not sufficient to induce a change in HIV-1 coreceptor usage.
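The three-group classification used in the abstract above (P, W, and non-P non-W) can be sketched as a small lookup, assuming that group membership is decided by an exact match of the tetrameric tip motif to GPGR or GWGR. The function names and the example motif list are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def tip_group(motif: str) -> str:
    """Assign a V3 tip tetramer to the P, W, or non-P non-W group."""
    tetramer = motif.upper()
    if tetramer == "GPGR":
        return "P"
    if tetramer == "GWGR":
        return "W"
    return "non-P non-W"


def group_proportions(motifs: list[str]) -> dict[str, float]:
    """Return the fraction of isolates falling into each group."""
    counts = Counter(tip_group(m) for m in motifs)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# Hypothetical tip motifs, one per isolate, for illustration only
example = ["GPGR", "GWGR", "GPGK", "GPGR"]
print(group_proportions(example))
```

Comparing the resulting proportions between cohorts would still require a statistical test, such as the one behind the reported P values, which this sketch does not attempt.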
Abstract:
Although the retrotransposon copia has been studied in the melanogaster group of Drosophila species, very little is known about copia dynamism and evolution in other groups. We analyzed the occurrence and heterogeneity of the copia 5' LTR-ULR partial sequence and their phylogenetic relationships in 24 species of the repleta group of Drosophila. PCR showed that copia occurs in 18 out of the 24 species evaluated. Sequencing was possible in only eight species. The sequences showed a low nucleotide diversity, which suggests selective constraints maintaining this regulatory region over evolutionary time. On the contrary, the low nucleotide divergence and the phylogenetic relationships between the D. willistoni/Zaprionus tuberculatus/melanogaster species subgroup suggest horizontal transfer. Sixteen transcription factor binding sites were identified in the LTR-ULR repleta and melanogaster consensus sequences. However, these motifs are not homologous, either in their position in the LTR-ULR sequences or in their sequences. Taken together, the low motif homologies, the phylogenetic relationship and the great nucleotide divergence between the melanogaster and repleta copia sequences reinforce the hypothesis that there are two copia families.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function first harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weight to the n-grams that appear in fewer classes and vice versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. 
Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is generic and can thus be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion: The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The use of amino acid substitution scores for similarity detection and of the dampening factor to normalize the unbalanced datasets has a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes, with a potential application to detecting proteome-specific motifs of different organisms.
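The harvesting and dampening steps described in this abstract can be illustrated with a short sketch. The abstract does not give the exact formula, so the dampening used here, log(1 + number of classes / number of classes containing the n-gram), is an assumption chosen only because it gives more weight to n-grams found in fewer classes. The BLOSUM62-based merging of similar n-grams and the selection threshold are omitted, and all names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def harvest_ngrams(seq: str, n_min: int = 4, n_max: int = 8):
    """Yield every contiguous n-gram of length n_min..n_max from seq."""
    for n in range(n_min, n_max + 1):
        for i in range(len(seq) - n + 1):
            yield seq[i:i + n]


def score_ngrams(classes: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Score n-grams per class: relative frequency times a dampening weight.

    Assumed dampening: log(1 + n_classes / classes_containing_ngram).
    """
    counts = {
        name: Counter(g for seq in seqs for g in harvest_ngrams(seq))
        for name, seqs in classes.items()
    }
    n_classes = len(classes)
    # In how many classes does each n-gram occur at least once?
    class_freq = Counter(g for cnt in counts.values() for g in cnt)
    scores = {}
    for name, cnt in counts.items():
        total = sum(cnt.values()) or 1  # guard against empty classes
        scores[name] = {
            g: (k / total) * math.log(1 + n_classes / class_freq[g])
            for g, k in cnt.items()
        }
    return scores


# Hypothetical toy classes for illustration only
toy = {"A": ["KKRKAAAA"], "B": ["GGGGAAAA"]}
scores = score_ngrams(toy)
print(scores["A"]["KKRK"], scores["A"]["AAAA"])
```

In this toy input, KKRK occurs only in class A while AAAA occurs in both classes, so the class-specific n-gram receives the higher score under the assumed weighting.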