929 results for protein sequence classification


Relevance:

90.00%

Publisher:

Abstract:

Background: The amino terminal half of the cellular prion protein PrPc is implicated in both the binding of copper ions and the conformational changes that lead to disease but has no defined structure. However, as some structure is likely to exist, we have investigated the use of an established protein refolding technology, fusion to green fluorescent protein (GFP), as a method to examine the refolding of the amino terminal domain of mouse prion protein. Results: Fusion proteins of PrPc and GFP were expressed at high levels in E. coli and could be purified to near homogeneity as insoluble inclusion bodies. Following denaturation, proteins were diluted into a refolding buffer, whereupon GFP fluorescence recovered with time. Using several truncations of PrPc, the rate of refolding was shown to depend on the prion sequence expressed. In a variation of the format, direct observation in E. coli, mutations introduced randomly in the PrPc protein sequence that affected folding could be selected directly by recovery of GFP fluorescence. Conclusion: Use of GFP as a measure of refolding of PrPc fusion proteins in vitro and in vivo proved informative. Refolding in vitro suggested a local structure within the amino terminal domain, while direct selection via fluorescence showed that as little as one amino acid change could significantly alter folding. These assay formats, not previously used to study PrP folding, may be generally useful for investigating PrPc structure and PrPc-ligand interaction.

Relevance:

90.00%

Publisher:

Abstract:

We show that most isolates of influenza A induce filamentous changes in infected cells, in contrast to the A/WSN/33 and A/PR8/34 strains, which have undergone extensive laboratory passage and are mouse-adapted. Using reverse genetics, we created recombinant viruses in the naturally filamentous genetic background of A/Victoria/3/75 and established that this property is regulated by the M1 protein sequence, but that the phenotype is complex and several residues are involved. The filamentous phenotype was lost when the amino acid at position 41 was switched from A to V; at the same time, this recombinant virus also became insensitive to the antibody 14C2. On the other hand, the filamentous phenotype could be fully transferred to a virus containing RNA segment 7 of the A/WSN/33 virus by a combination of three mutations in both the amino and carboxy regions of the M1 protein. This observation suggests that an interaction among these regions of M1 may occur during assembly. © 2004 Elsevier Inc. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

Dynamically disordered regions appear to be relatively abundant in eukaryotic proteomes. The DISOPRED server allows users to submit a protein sequence, and returns a probability estimate of each residue in the sequence being disordered. The results are sent in both plain text and graphical formats, and the server can also supply predictions of secondary structure to provide further structural information.

Relevance:

90.00%

Publisher:

Abstract:

The PSIPRED protein structure prediction server allows users to submit a protein sequence, perform a prediction of their choice and receive the results of the prediction both textually via e-mail and graphically via the web. The user may select one of three prediction methods to apply to their sequence: PSIPRED, a highly accurate secondary structure prediction method; MEMSAT 2, a new version of a widely used transmembrane topology prediction method; or GenTHREADER, a sequence profile based fold recognition method.

Relevance:

90.00%

Publisher:

Abstract:

Abscisic acid (ABA)-mediated gene expression is a critical component of plant responses to this important hormone, which affects plant growth, development, and responses to environmental stresses. Plant responses to ABA are mediated by a number of factors including PKABA1, an ABA-induced protein kinase involved in ABA-suppressed gene expression in cereal grains, and TaWD40, which has previously been shown to physically interact with PKABA1. A full-length 1.9 kb TaWD40 cDNA, CK210682, was sequenced as part of this project. Based on the deduced protein sequence, it is thought that TaWD40 may belong to the family of E3 ubiquitin ligases, possibly targeting PKABA1 for destruction. Construction of expression plasmids for overproduction of the TaWD40 polypeptide in E. coli is currently underway. The TaWD40 cDNA has been successfully amplified from the source plasmid and inserted into an intermediate plasmid, pCR2.1. The TaWD40 cDNA is currently being cloned from the pCR2.1 intermediate plasmid into two different expression vectors, pRSET-A and pMAL-c2x, for future protein production and purification.

Relevance:

90.00%

Publisher:

Abstract:

Within about 30 years the Brazilian buffalo (Bubalus bubalis) herd will reach approximately 50 million head as a result of the great adaptive capacity of these animals to tropical climates, together with the good productive and reproductive potential which make these animals an important animal protein source for poor and developing countries. The myostatin gene (GDF8) is important in the physiology of stock animals because its product has a direct effect on muscle development and consequently also on meat production. The myostatin sequence is known in several mammalian species and shows a high degree of amino acid sequence conservation, although non-silent and silent changes in the coding sequences and several alterations in the introns and untranslated regions have been identified. The objective of our work was to characterize the myostatin coding regions of B. bubalis (Murrah breed) and to compare them with the Bos taurus regions, looking for variations in nucleotide and protein sequences. In this way, we were able to identify 12 variations at the DNA level and five alterations in the presumed myostatin protein sequence as compared to non-double-muscled bovine sequences.

Relevance:

90.00%

Publisher:

Abstract:

A lectin-like protein from the seeds of Acacia farnesiana was isolated from the albumin fraction, characterized, and sequenced by tandem mass spectrometry. The albumin fraction was extracted with 0.5 M NaCl, and the lectin-like protein of A. farnesiana (AFAL) was purified by ion-exchange chromatography (Mono-Q) followed by chromatofocusing. AFAL agglutinated rabbit erythrocytes and did not agglutinate human ABO erythrocytes, either native or treated with proteolytic enzymes. In sodium dodecyl sulfate gel electrophoresis under reducing and nonreducing conditions, AFAL separated into two bands with subunit molecular masses of 35 and 50 kDa. The homogeneity of the purified protein was confirmed by chromatofocusing, with a pI of 4.0 ± 0.5. Molecular exclusion chromatography confirmed time-dependent oligomerization in AFAL, in accordance with mass spectrometry analysis, which confers an alteration in AFAL affinity for chitin. The protein sequence was obtained by a liquid chromatography quadrupole time-of-flight experiment and showed that AFAL has 68% and 63% sequence similarity with lectins of Phaseolus vulgaris and Dolichos biflorus, respectively.

Relevance:

90.00%

Publisher:

Abstract:

Papillomaviruses (PVs) infect a wide range of animal species and show great genetic diversity. To date, excluding equine sarcoids, only three species of PVs have been identified in association with lesions in horses: Equus caballus papillomavirus 1 (EcPV1, cutaneous), EcPV2 (genital) and EcPV3 (aural plaques). In this study, we identified a novel equine PV from aural plaques, which we designated EcPV4. Cutaneous samples from horses with lesions that were microscopically diagnosed as aural plaques were subjected to DNA extraction, amplification and sequencing. Rolling circle amplification and inverse PCR with specific primers confirmed the presence of an approximately 8 kb circular genome. The full-length EcPV4 L1 major capsid protein sequence has 1488 nucleotides (495 amino acids). EcPV4 had a sequence identity of only 53.3%, 60.2% and 51.7% when compared with the published sequences for EcPV1, EcPV2 and EcPV3, respectively. A Bayesian phylogenetic analysis indicated that EcPV4 clusters with EcPV2, but not with EcPV1 and EcPV3. Using the current PV classification system that is based on the nucleotide sequence of L1, we could not define the genus of the newly identified virus. Therefore, a structural analysis of the L1 protein was carried out to aid in this classification, because EcPV4 causes lesions similar to those caused by EcPV3. A comparison of the superficial loops demonstrated a distinct amino acid conservation pattern between EcPV4/EcPV2 and EcPV4/EcPV3. These results demonstrate the presence of a new equine PV species and that structural studies could be useful in the classification of PVs. © 2012 Elsevier B.V.

Relevance:

90.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

90.00%

Publisher:

Abstract:

The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it has become urgent to develop new methods that are able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein fold recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances, related to the specific type of amino acid pair, that are encoded in the so-called contact map. An interesting new way of analyzing those structures emerged when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to what extent the characteristic path length and clustering coefficient of the protein contact network reveal characteristic features of protein contact maps.
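As an illustration of the two network measures named above, the following minimal sketch builds a contact network from residue coordinates and computes its characteristic path length and mean clustering coefficient. The 8 Å cutoff, the use of one coordinate per residue, and all function names are assumptions made for this example; the abstract does not state how contacts were defined.

```python
import math
from collections import deque
from itertools import combinations


def contact_graph(coords, cutoff=8.0):
    """Build an undirected contact network: residues i and j are linked
    when their coordinates lie within `cutoff` (an assumed threshold)."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")


def clustering_coefficient(adj):
    """Mean local clustering coefficient; nodes with fewer than two
    neighbours contribute zero."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs) if coeffs else 0.0
```

A small-world network is usually characterized by a short characteristic path length combined with a high clustering coefficient, both judged against a random graph of the same size and density; that comparison is omitted here.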
Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, as it is the starting point for understanding the complex processes involved in biological networks. The rapid growth in the number of protein sequences and structures available poses new fundamental problems that still deserve an interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, currently only approximately 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized.
A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists of assigning sequences to a specific group of functionally related sequences which have been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated transfer-through-inheritance procedure for molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of multi-domain protein annotation and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structure templates on the length of protein sequences within clusters ensures that multi-domain proteins, when present, can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering information available in the present databases of molecular functions and structures.
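The clustering step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the threshold values, the coverage definition, or the grouping algorithm, so the 40% identity and 90% coverage defaults, coverage measured against the longer sequence, and single-linkage grouping via union-find are all placeholders. The hit tuple layout is likewise hypothetical, not a BLAST output format.

```python
from collections import defaultdict


def cluster_by_identity_and_coverage(hits, lengths,
                                     min_identity=40.0, min_coverage=0.9):
    """Group sequences linked by pairwise hits that pass both thresholds.

    hits:    iterable of (query_id, subject_id, percent_identity, alignment_length)
    lengths: dict mapping every sequence id to its length in residues
    Threshold defaults are assumed values, not taken from the abstract.
    """
    parent = {seq_id: seq_id for seq_id in lengths}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, aln_len in hits:
        if query == subject:
            continue
        # Coverage relative to the longer sequence (an assumption), so a short
        # domain matching part of a long multi-domain protein does not link them.
        coverage = aln_len / max(lengths[query], lengths[subject])
        if identity >= min_identity and coverage >= min_coverage:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q

    clusters = defaultdict(set)
    for seq_id in lengths:
        clusters[find(seq_id)].add(seq_id)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])
```

Requiring coverage over the longer sequence is one way to obtain the length homogeneity the abstract mentions, since two sequences of very different lengths can never satisfy it.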

Relevance:

90.00%

Publisher:

Abstract:

Different types of proteins exist with diverse functions that are essential for living organisms. An important class of proteins is represented by transmembrane proteins, which are specifically designed to be inserted into biological membranes and perform very important functions in the cell such as cell communication and active transport across the membrane. Transmembrane β-barrels (TMBBs) are a sub-class of membrane proteins largely under-represented in structure databases because of the extreme difficulty in experimental structure determination. For this reason, computational tools that are able to predict the structure of TMBBs are needed. In this thesis, two computational problems related to TMBBs were addressed: the detection of TMBBs in large datasets of proteins and the prediction of the topology of TMBB proteins. Firstly, a method for TMBB detection was presented based on a novel neural network framework for variable-length sequence classification. The proposed approach was validated on a non-redundant dataset of proteins. Furthermore, we carried out genome-wide detection using the entire Escherichia coli proteome. In both experiments, the method significantly outperformed other existing state-of-the-art approaches, reaching very high PPV (92%) and MCC (0.82). Secondly, a method was also introduced for TMBB topology prediction. The proposed approach is based on grammatical modelling and probabilistic discriminative models for sequence data labeling. The method was evaluated using a newly generated dataset of 38 TMBB proteins obtained from high-resolution data in the PDB. Results have shown that the model is able to correctly predict topologies of 25 out of 38 protein chains in the dataset. When tested on previously released datasets, the performance of the proposed approach was comparable or superior to the current state of the art in TMBB topology prediction.
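For readers unfamiliar with the two scores quoted above, the sketch below computes positive predictive value (PPV) and the Matthews correlation coefficient (MCC) from confusion-matrix counts using their standard definitions. The function names are illustrative, and returning 0.0 when a denominator is zero is a convention chosen for this example; the abstract does not report the underlying counts.

```python
import math


def ppv(tp, fp):
    """Positive predictive value: fraction of predicted positives that are true."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to +1.

    Returns 0.0 when any marginal total is zero (a convention assumed here).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

MCC is often preferred over accuracy for detection tasks such as this one, where true TMBBs are a small minority of a proteome, because it accounts for all four cells of the confusion matrix.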

Relevance:

90.00%

Publisher:

Abstract:

Phosphatidylinositol transfer proteins (PI-TPs) catalyze the transfer of phosphatidylinositol and phosphatidylcholine between membranes in vitro. However, the in vivo function of these proteins is unknown. In this thesis we have used a combined biochemical and genetic approach to determine the importance of PI-TP in vivo. An oligonucleotide based on the amino terminal sequence of the PI-TP from Saccharomyces cerevisiae was used to screen a yeast genomic library for the gene encoding PI-TP (PIT1 gene). Yeast strains transformed with the positive clones showed overproduction of transfer activities and transfer protein in the 100,000 x g supernatants. The 5′ terminus of the PIT1 gene correlates with the predicted codons for residues 3-30 of the determined protein sequence. Tetrad analysis of a heterozygous diploid (PIT1/pit1::LEU2) revealed that the PIT1 gene is essential for cell growth. Non-viable spores could be rescued by transformation of the above diploid, prior to sporulation, with a plasmid-borne copy of the wild-type gene. Sequencing of the entire PIT1 gene has revealed that it is identical to the SEC14 gene. The sec14 ts mutant, which exhibits conditional defects at the Golgi stage of protein secretion, is also temperature sensitive for PI-TP activity in vitro. These findings represent the first instance in which a physiological function has been assigned to any phospholipid transfer protein.

Relevance:

90.00%

Publisher:

Abstract:

The focus of this thesis lies in the development of a sensitive method for the analysis of protein primary structure which can be easily used to confirm the DNA sequence of a protein's gene and determine the modifications which are made after translation. This technique involves the use of dipeptidyl aminopeptidase (DAP) and dipeptidyl carboxypeptidase (DCP) to hydrolyze the protein and the mass spectrometric analysis of the dipeptide products. Dipeptidyl carboxypeptidase was purified from human lung tissue and characterized with respect to its proteolytic activity. The results showed that the enzyme has a relatively unrestricted specificity, making it useful for the analysis of the C-terminus of proteins. Most of the dipeptide products were identified using gas chromatography/mass spectrometry (GC/MS). In order to analyze the peptides not hydrolyzed by DCP and DAP, as well as the dipeptides not identified by GC/MS, a FAB ion source was installed on a quadrupole mass spectrometer and its performance evaluated with a variety of compounds. Using these techniques, the sequences of the N-terminal and C-terminal regions and seven fragments of bacteriophage P22 tail protein have been verified. All of the dipeptides identified in these analyses were in the same DNA reading frame, thus ruling out the possibility of a single base being inserted or deleted from the DNA sequence. The verification of small sequences throughout the protein sequence also indicates that no large portions of the protein have been removed after translation.

Relevance:

90.00%

Publisher:

Abstract:

Lodestar, a Drosophila maternal-effect gene, is essential for proper chromosome segregation during embryonic mitosis. Mutations in lodestar cause chromatin bridging in anaphase, preventing the sister chromatids from fully separating and leaving chromatin tangled at the metaphase plate. Drosophila lodestar protein was originally identified, in purified fractions of Drosophila Kc cell nuclear extracts, by its ability to suppress the generation of long RNA polymerase II transcripts. The human homolog of this protein (hLodestar) was cloned and studied in comparison to the Drosophila lodestar activities. The results of these studies show that, like the Drosophila protein, hLodestar has dsDNA-dependent ATPase and transcription termination activity in vitro. hLodestar has also been shown to release RNA polymerase I and II stalled at a cyclobutane thymine dimer. Lodestar belongs to the SNF2 family of proteins, which are members of the DExH/D helicase super-family. The SNF2 family of proteins is believed to play a critical role in altering protein-DNA interactions in a variety of cellular contexts. We have recently isolated a human cDNA (hLodestar) that shares significant homology with the Drosophila lodestar gene. The 4.6 kb clone contains an open reading frame of 1162 amino acids, and shares 55% similarity and 46% identity with the Drosophila Lodestar protein sequence. Our studies looking for hLodestar-interacting proteins revealed an association with CDC5L in the yeast two-hybrid system and in co-immunoprecipitation experiments. CDC5L has been well documented to be a component of the spliceosome. Our data from in vitro assembly and splicing reactions, in addition to the association of hLodestar with spliceosomes purified from HeLa nuclear extract, suggest that hLodestar is involved in splicing. Although many other members of the DExH/D helicase super-family have been linked to splicing, this is the first SNF2 family member to be implicated in the splicing reaction.