964 results for Amino-acid Sequences
Abstract:
An increasing number of proteins with weak sequence similarity have been found to assume similar three-dimensional folds and often have similar or related biochemical or biophysical functions. We propose a method for detecting the fold similarity between two proteins with low sequence similarity based on their amino acid properties alone. The method, the proximity correlation matrix (PCM) method, is built on the observation that the physical properties of neighboring amino acid residues in sequence at structurally equivalent positions of two proteins of similar fold are often correlated even when the amino acid sequences are different. Hydrophobicity is shown to be the most strongly correlated property for all protein fold classes. The PCM method was tested on 420 proteins belonging to 64 different known folds, each having at least three proteins with little sequence similarity. The method was able to detect fold similarities for 40% of the 420 sequences. Compared with sequence comparison and several fold-recognition methods, the method demonstrates good performance in detecting fold similarities among proteins with low sequence identity. Applied to the complete genome of Methanococcus jannaschii, the method recognized the folds for 22 hypothetical proteins.
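The central observation in this abstract, that hydrophobicity can be correlated at equivalent positions even when the residues themselves differ, can be illustrated with a small sketch. This is not the PCM method: the abstract does not describe how the matrix is built. The sketch assumes the Kyte-Doolittle hydropathy scale and two equal-length, already-aligned segments, and the function names are hypothetical.

```python
from math import sqrt

# Kyte-Doolittle hydropathy scale. Using this particular scale is an
# assumption; the abstract does not say which scale the authors used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq):
    """Map each residue of a protein segment to its hydropathy value."""
    return [KYTE_DOOLITTLE[aa] for aa in seq.upper()]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def hydropathy_correlation(segment_a, segment_b):
    """Correlate the hydropathy profiles of two aligned, equal-length segments."""
    return pearson(hydropathy_profile(segment_a), hydropathy_profile(segment_b))
```

For the toy segments `ILVKDE` and `VILRNQ`, which share no residue at any position, the profiles are still strongly correlated, whereas `LIVKDE` against `KDELIV` gives a negative value.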
Abstract:
Extracellular superoxide dismutase (EC-SOD) is a secreted Cu and Zn-containing glycoprotein. While EC-SOD from most mammals is tetrameric and has a high affinity for heparin and heparan sulfate, rat EC-SOD has a low affinity for heparin, does not bind to heparan sulfate in vivo, and is apparently dimeric. To examine the molecular basis of the deviant physical properties of rat EC-SOD, the cDNAs of the rat and mouse EC-SODs were isolated and the deduced amino acid sequences were compared with that of human EC-SOD. Comparison of the sequences offered no obvious explanation of the differences. Analysis of a series of chimeric and point mutated EC-SODs showed that the N-terminal region contributes to the oligomeric state of the EC-SODs, and that a single amino acid, a valine (human amino acid position 24), is essential for the tetramerization. This residue is replaced by an aspartate in the rat. Rat EC-SOD carrying an Asp --> Val mutation is tetrameric and has a high heparin affinity, while mouse EC-SOD with a Val --> Asp mutation is dimeric and has lost its high heparin affinity. Thus, the rat EC-SOD dimer is converted to a tetramer by the exchange of a single amino acid. Furthermore, the cooperative action of four heparin-binding domains is necessary for high heparin affinity. These results also suggest that tetrameric EC-SODs are not symmetrical tetrahedrons, but composed of two interacting dimers, further supporting an evolutionary relationship with the dimeric cytosolic Cu and Zn-containing SODs.
Abstract:
Since ribosomally mediated protein biosynthesis is confined to the L-amino acid pool, the presence of D-amino acids in peptides was considered for many years to be restricted to proteins of prokaryotic origin. Unicellular microorganisms have been responsible for the generation of a host of D-amino acid-containing peptide antibiotics (gramicidin, actinomycin, bacitracin, polymyxins). Recently, a series of mu and delta opioid receptor agonists [dermorphins and deltorphins] and neuroactive tetrapeptides containing a D-amino acid residue have been isolated from amphibian (frog) skin and mollusks. Amino acid sequences obtained from the cDNA libraries coincide with the observed dermorphin and deltorphin sequences, suggesting a stereospecific posttranslational amino acid isomerization of unknown mechanism. A cofactor-independent serine isomerase found in the venom of the Agelenopsis aperta spider provides the first major clue to explain how multicellular organisms are capable of incorporating single D-amino acid residues into these and other eukaryotic peptides. The enzyme is capable of isomerizing serine, cysteine, O-methylserine, and alanine residues in the middle of peptide chains, thereby providing a biochemical capability that, until now, had not been observed. Both D- and L-amino acid residues are susceptible to isomerization. The substrates share a common Leu-Xaa-Phe-Ala recognition site. Early in the reaction sequence, solvent-derived deuterium resides solely with the epimerized product (not substrate) in isomerizations carried out in 2H2O. Significant deuterium isotope effects are obtained in these reactions in addition to isomerizations of isotopically labeled substrates (2H at the epimerizable serine alpha-carbon atom). The combined kinetic and structural data suggest a two-base mechanism in which abstraction of a proton from one face is concomitant with delivery from the opposite face by the conjugate acid of the second enzymic base.
Abstract:
The amino acid sequences of a number of closely related proteins ("napin") isolated from Brassica napus were determined by mass spectrometry without prior separation into individual components. Some of these proteins correspond to those previously deduced (napA, BngNAP1, and gNa), chiefly from DNA sequences. Others were found to differ to a varying extent (BngNAP1', BngNAP1A, BngNAP1B, BngNAP1C, gNa', and gNaA). The short chains of gNa and gNa' and of BngNAP1 and BngNAP1' differ by the replacement of N-terminal proline by pyroglutamic acid; the long chains of gNaA and BngNAP1B contain a six amino acid stretch, MQGQQM, which is present in gNa (according to its DNA sequence) but absent from BngNAP1 and BngNAP1C. These alternations of sequences between napin isoforms are most likely due to homologous recombination of the genetic material, but some of the changes may also be due to RNA editing. The amino acids that follow the untruncated C termini of those napin chains for which the DNA sequences are known (napA, BngNAP1, and gNa) are aromatic amino acids. This suggests that the processing of the proprotein leading to the C termini of the two chains is due to the action of a protease that specifically cleaves a G/S-F/Y/W bond.
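The proposed protease specificity can be made concrete with a short motif scan. The sketch below reads "G/S-F/Y/W" as glycine or serine immediately followed by an aromatic residue, with the cut between the two; that reading, the function name, and the toy sequence in the usage note are assumptions, not taken from the paper.

```python
import re

# Assumed reading of "G/S-F/Y/W": Gly or Ser directly before the scissile
# bond, and Phe, Tyr or Trp directly after it. The lookahead keeps the
# aromatic residue out of the match so adjacent sites are all reported.
CLEAVAGE_PATTERN = re.compile(r"[GS](?=[FYW])")


def candidate_cleavage_sites(seq):
    """Return 1-based positions of residues that precede a candidate cut.

    A returned position p means the hypothetical cut lies between
    residue p and residue p + 1.
    """
    return [m.start() + 1 for m in CLEAVAGE_PATTERN.finditer(seq.upper())]
```

On the made-up sequence `MAGFKKSYQ`, the function reports positions 3 and 7, that is, candidate cuts after Gly3 and Ser7. A match only flags a sequence-compatible site; it says nothing about whether the bond is actually cleaved.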
Abstract:
Background: Designing novel proteins with site-directed recombination has enormous prospects. By locating effective recombination sites for swapping sequence parts, the probability that hybrid sequences have the desired properties is increased dramatically. The prohibitive requirements for applying current tools led us to investigate machine learning to assist in finding useful recombination sites from amino acid sequence alone. Results: We present STAR, Site Targeted Amino acid Recombination predictor, which produces a score indicating the structural disruption caused by recombination, for each position in an amino acid sequence. Example predictions contrasted with those of alternative tools illustrate STAR's utility to assist in determining useful recombination sites. Overall, the correlation coefficient between the output of the experimentally validated protein design algorithm SCHEMA and the prediction of STAR is very high (0.89). Conclusion: STAR allows the user to explore useful recombination sites in amino acid sequences with unknown structure and unknown evolutionary origin. The predictor service is available from http://pprowler.itee.uq.edu.au/star.
Abstract:
Subunit vaccine discovery is an accepted clinical priority. The empirical approach is time- and labor-consuming and can often end in failure. Rational information-driven approaches can overcome these limitations in a fast and efficient manner. However, informatics solutions require reliable algorithms for antigen identification. All known algorithms use sequence similarity to identify antigens. However, antigenicity may be encoded subtly in a sequence and may not be directly identifiable by sequence alignment. We propose a new alignment-independent method for antigen recognition based on the principal chemical properties of protein amino acid sequences. The method is tested by cross-validation on a training set of bacterial antigens and external validation on a test set of known antigens. The prediction accuracy is 83% for the cross-validation and 80% for the external test set. Our approach is accurate and robust, and provides a potent tool for the in silico discovery of medically relevant subunit vaccines.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
The aim of this work was to investigate the involvement of caspases in apoptosis induced by L-amino acid oxidase (LAAO) isolated from Bothrops atrox snake venom. The isolation of LAAO involved three chromatographic steps: molecular exclusion on a G-75 column, ion exchange by HPLC, and affinity chromatography on a Lentil Lectin column. SDS-PAGE was used to confirm the expected high purity level of BatroxLAAO. It is a glycoprotein with 12% sugar and an acidic character, as confirmed by its amino acid composition, rich in Asp and Glu residues. It displays high specificity toward hydrophobic L-amino acids. The N-terminal amino acid sequence and internal peptide sequences showed close structural homology to other snake venom LAAOs. This enzyme induces in vitro platelet aggregation, which may be due to H2O2 production by LAAOs, since the addition of catalase completely inhibited the aggregation effect. It also showed cytotoxicity towards several cancer cell lines: HL60, Jurkat, B16F10 and PC12. The cytotoxic activity was abolished by catalase. A fluorescence microscopy evaluation revealed a significant increase in the apoptotic index of these cells after BatroxLAAO treatment. This observation was confirmed by phosphatidylserine exposure and activation of caspases. BatroxLAAO is a protein with various biological functions that can be involved in envenomation. Further investigations of its function will contribute to toxicology advances. Published by Elsevier Inc.
Abstract:
As a consequence of selective pressure exerted by the immune response during hepatitis C virus (HCV) infection, a high rate of nucleotide mutations in the viral genome is observed which leads to the emergence of viral escape mutants. The aim of this study was to evaluate the evolution of the amino acid (aa) sequence of the HCV nonstructural protein 3 (NS3) in viral isolates after liver transplantation. Six patients with HCV-induced liver disease undergoing liver transplantation (LT) were followed up for sequence analysis. Hepatitis C recurrence was observed in all patients after LT. The rate of synonymous (dS) nucleotide substitutions was much higher than that of nonsynonymous (dN) ones in the NS3 encoding region. The high values of the dS/dN ratios suggest no sustained adaptive evolution selection pressure and, therefore, absence of specific NS3 viral populations. Clinical genotype assignments were supported by phylogenetic analysis. Serial samples from each patient showed lower mean nucleotide genetic distance when compared with samples of the same HCV genotype and subtype. The NS3 samples studied had an N-terminal aa sequence with several differences as compared with reference ones, mainly in genotype 1b-infected patients. After LT, as compared with the sequences before, a few reverted aa substitutions and several established aa substitutions were observed at the N-terminal of NS3. Sites described to be involved in important functions of NS3, notably those of the catalytic triad and zinc binding, remained unaltered in terms of aa sequence. Rare or frequent aa substitutions occurred indiscriminately in different positions. Several cytotoxic T lymphocyte epitopes described for HCV were present in our 1b samples. Nevertheless, the deduced secondary structure of the NS3 protease showed a few alterations in samples from genotype 3a patients, but none were seen in 1b cases. 
Our data, obtained from patients under important selective pressure during LT, show that the NS3 protease remains well conserved, mainly in HCV 3a patients. It reinforces its potential use as an antigenic candidate for further studies aiming at the development of a protective immune response.
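The comparison of synonymous and nonsynonymous substitutions rests on classifying each codon difference by whether it changes the encoded amino acid. The sketch below shows only that classification step for two aligned coding sequences. It is a crude count, not the rate estimate (dS, dN per site) reported in the study, whose estimation method is not given in the abstract. The function name is hypothetical, and the input is assumed to be in-frame, gap-free and limited to A, C, G and T.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(cds_a, cds_b):
    """Count codons that differ synonymously and nonsynonymously.

    Simplification: a codon pair with two or three differing bases is
    counted as a single change, and no per-site normalization is applied.
    Returns a tuple (synonymous, nonsynonymous).
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = nonsynonymous = 0
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous
```

For the toy pair `GAAGGTCTT` and `GAGGGTCAT`, the first codon changes GAA to GAG (Glu in both, synonymous) and the third changes CTT to CAT (Leu to His, nonsynonymous), so the function returns `(1, 1)`.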
Abstract:
A plausible approach to evaluate the inhibitory action of antifungals is through the investigation of the fungal resistance to these drugs. We describe here the molecular cloning and initial characterization of the A. nidulans lipA gene, in which a mutation (lipA1) conferred resistance to undecanoic acid, the most fungitoxic fatty acid in the C(7:0)-C(18:0) series. The lipA gene codes for a putative lipase with the consensus sequences GVSIS and WIFGGG as the catalytic signature. Comparison of the wild-type and LIP1 mutant strain nucleotide sequences showed a G -> A change in the lipA1 allele, which results in a Glu(214) -> Lys substitution in the LipA protein. This ionic charge change in a conserved LipA region, next to its catalytic site, may have altered the catalytic properties of this enzyme, resulting in resistance to undecanoic acid.
Abstract:
BACKGROUND: The expansion of amino acid repeats is determined by a high mutation rate and can be increased or limited by selection. It has been suggested that recent expansions could be associated with the potential of adaptation to new environments. In this work, we quantify the strength of this association, as well as the contribution of potential confounding factors. RESULTS: Mammalian positively selected genes have accumulated more recent amino acid repeats than other mammalian genes. However, we found little support for an accelerated evolutionary rate as the main driver for the expansion of amino acid repeats. The most significant predictors of amino acid repeats are gene function and GC content. There is no correlation with expression level. CONCLUSIONS: Our analyses show that amino acid repeat expansions are causally independent from protein adaptive evolution in mammalian genomes. Relaxed purifying selection or positive selection do not associate with more or more recent amino acid repeats. Their occurrence is slightly favoured by the sequence context but mainly determined by the molecular function of the gene.
Abstract:
Background: Amino acid tandem repeats are found in nearly one-fifth of human proteins. Abnormal expansion of these regions is associated with several human disorders. To gain further insight into the mutational mechanisms that operate in this type of sequence, we have analyzed a large number of mutation variants derived from human expressed sequence tags (ESTs). Results: We identified 137 polymorphic variants in 115 different amino acid tandem repeats. Of these, 77 contained amino acid substitutions and 60 contained gaps (expansions or contractions of the repeat unit). The analysis showed that at least about 21% of the repeats might be polymorphic in humans. We compared the mutations found in different types of amino acid repeats and in adjacent regions. Overall, repeats showed a five-fold increase in the number of gap mutations compared to adjacent regions, reflecting the action of slippage within the repetitive structures. Gap and substitution mutations were very differently distributed between different amino acid repeat types. Among repeats containing gap variants we identified several disease and candidate disease genes. Conclusion: This is the first report at a genome-wide scale of the types of mutations occurring in the amino acid repeat component of the human proteome. We show that the mutational dynamics of different amino acid repeat types are very diverse. We provide a list of loci with highly variable repeat structures, some of which may be potentially involved in disease.
Abstract:
Amino acid tandem repeats, also called homopolymeric tracts, are extremely abundant in eukaryotic proteins. To gain insight into the genome-wide evolution of these regions in mammals, we analyzed the repeat content in a large data set of rat-mouse-human orthologs. Our results show that human proteins contain more amino acid repeats than rodent proteins and that trinucleotide repeats are also more abundant in human coding sequences. Using the human species as an outgroup, we were able to address differences in repeat loss and repeat gain in the rat and mouse lineages. In this data set, mouse proteins contain substantially more repeats than rat proteins, which can be at least partly attributed to a higher repeat loss in the rat lineage. The data are consistent with a role for trinucleotide slippage in the generation of novel amino acid repeats. We confirm the previously observed functional bias of proteins with repeats, with overrepresentation of transcription factors and DNA-binding proteins. We show that genes encoding amino acid repeats tend to have an unusually high GC content, and that differences in coding GC content among orthologs are directly related to the presence/absence of repeats. We propose that the different GC content isochore structure in rodents and humans may result in an increased amino acid repeat prevalence in the human lineage.
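Two quantities recur in this abstract: homopolymeric tracts in a protein and the GC content of its coding sequence. A minimal sketch of both follows. The minimum tract length of five residues is an assumed threshold, since the abstract does not state the one used, and the function names are hypothetical.

```python
import re


def find_homopolymer_tracts(protein, min_len=5):
    """Find runs of a single amino acid at least min_len residues long.

    min_len=5 is an assumed default, not a value from the abstract.
    Returns a list of (residue, 1-based start position, run length).
    """
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    # One residue followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start() + 1, m.end() - m.start())
        for m in pattern.finditer(protein.upper())
    ]


def gc_content(dna):
    """Fraction of G and C bases in a nucleotide sequence (0.0 if empty)."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)
```

On the made-up protein `MAQQQQQQKLAAAAAG`, the scan reports a six-residue glutamine tract starting at position 3 and a five-residue alanine tract starting at position 11. Comparing `gc_content` between orthologous coding sequences with and without such tracts would be the kind of contrast the abstract describes, though the study's actual procedure is not specified here.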
Abstract:
Aromatic amino acid hydroxylase (AAAH) genes and insulin-like genes form part of an extensive paralogy region shared by human chromosomes 11 and 12, thought to have arisen by tetraploidy in early vertebrate evolution. Cloning of a complementary DNA (cDNA) for an amphioxus (Branchiostoma floridae) hydroxylase gene (AmphiPAH) allowed us to investigate the ancestry of the human chromosome 11/12 paralogy region. Molecular phylogenetic evidence reveals that AmphiPAH is orthologous to vertebrate phenylalanine hydroxylase (PAH) genes; the implication is that all three vertebrate AAAH genes arose early in metazoan evolution, predating vertebrates. In contrast, our phylogenetic analysis of amphioxus and vertebrate insulin-related gene sequences is consistent with duplication of these genes during early chordate ancestry. The conclusion is that two tightly linked gene families on human chromosomes 11 and 12 were not duplicated coincidentally. We rationalize this paradox by invoking gene loss in the AAAH gene family and conclude that paralogous genes shared by paralogous chromosomes need not have identical evolutionary histories.
Abstract:
The elucidation of the domain content of a given protein sequence in the absence of determined structure or significant sequence homology to known domains is an important problem in structural biology. Here we address how successfully the delineation of continuous domains can be accomplished in the absence of sequence homology using simple baseline methods, an existing prediction algorithm (Domain Guess by Size), and a newly developed method (DomSSEA). The study was undertaken with a view to measuring the usefulness of these prediction methods in terms of their application to fully automatic domain assignment. Thus, the sensitivity of each domain assignment method was measured by calculating the number of correctly assigned top scoring predictions. We have implemented a new continuous domain identification method using the alignment of predicted secondary structures of target sequences against observed secondary structures of chains with known domain boundaries as assigned by Class Architecture Topology Homology (CATH). Taking top predictions only, the success rate of the method in correctly assigning domain number to the representative chain set is 73.3%. The top prediction for domain number and location of domain boundaries was correct for 24% of the multidomain set (±20 residues). These results have been put into context in relation to the results obtained from the other prediction methods assessed.