23 results for Multiple sequence alignment

in University of Queensland eSpace - Australia


Relevance:

100.00%

Publisher:

Abstract:

Five ripening-related ACC synthase cDNA isoforms were cloned from 80% ripe papaya cv. 'Sinta' by reverse transcription-PCR using gene-specific primers. Clone 2 had the longest transcript and contained all common exons and three alternative exons. Clones 3 and 4 contained common exons and one alternative exon each, while clone 1, the most common transcript, contained only the common exons. Clone 5 could be due to cloning artifacts and might not be a unique cDNA fragment. Thus, there are only four isoforms of ACC synthase mRNA. Southern blot analysis indicates that all five clones came from only one gene existing as a single copy in the 'Sinta' papaya genome. Multiple sequence alignment indicates that the four isoforms arise from a single gene, possibly through alternative splicing mechanisms. All the putative alternative exons were present at the 5'-end of the gene comprising the N-terminal region of the protein. 'Sinta' ACC synthase cDNAs were of the capacs 1 type and are most closely related to a 1.4 kb capacs 1-type DNA (AJ277160) from Eksotika papaya. No capacs 2-type cDNAs were cloned from 'Sinta' by RT-PCR. This is the first report of a possible alternative splicing mechanism in ripening-related ACC synthase genes in hybrid papaya, possibly to modulate or fine-tune gene expression relevant to fruit ripening.

Relevance:

90.00%

Publisher:

Abstract:

Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
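The contact number definition in this abstract can be illustrated with a minimal sketch. It counts, for each residue, the C-beta atoms of other residues that lie within a sphere around its own C-beta atom. The function name `contact_numbers`, the coordinate format, and the radius values are assumptions made for illustration; the abstract does not state the sphere radius, and this is not the authors' implementation.

```python
import math

def contact_numbers(cb_coords, radius):
    """Return the contact number of each residue.

    cb_coords: list of (x, y, z) C-beta coordinates, one per residue.
    radius: sphere radius in the same units (an assumed parameter;
            the abstract does not give a value).
    """
    counts = []
    for i, centre in enumerate(cb_coords):
        n = 0
        for j, other in enumerate(cb_coords):
            # Count C-beta atoms of *other* residues inside the sphere.
            if i != j and math.dist(centre, other) <= radius:
                n += 1
        counts.append(n)
    return counts

# Toy coordinates (hypothetical, not from a real structure).
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_numbers(coords, radius=6.0))  # [1, 2, 1, 0]
```

Thresholding these counts would give the two-class "contacted" versus "non-contacted" labels mentioned in the abstract; the threshold itself is again a free choice.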

Relevance:

90.00%

Publisher:

Abstract:

A 16S rRNA gene database (http://greengenes.bl.gov) addresses limitations of public repositories by providing chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies. It was found that there is incongruent taxonomic nomenclature among curators even at the phylum level. Putative chimeras were identified in 3% of environmental sequences and in 0.2% of records derived from isolates. Environmental sequences were classified into 100 phylum-level lineages in the Archaea and Bacteria.

Relevance:

90.00%

Publisher:

Abstract:

At present, little is known about signal transduction mechanisms in schistosomes, which cause schistosomiasis. The mitogen-activated protein kinase (MAPK) signaling pathways, which are evolutionarily conserved from yeast to Homo sapiens, play key roles in multiple cellular processes. Here, we reconstructed the hypothetical MAPK signaling pathways in Schistosoma japonicum and compared the schistosome pathways with those of model eukaryote species. We identified 60 homologous components in the S. japonicum MAPK signaling pathways. Among these, 27 were predicted to be full-length sequences. Phylogenetic analysis of these proteins confirmed the evolutionary conservation of the MAPK signaling pathways. Remarkably, we identified S. japonicum homologues of GTP-binding protein beta and alpha-I subunits in the yeast mating pathway, which might also be involved in the regulation of different life stages and female sexual maturation processes in schistosomes. In addition, several pathway member genes, including ERK, JNK, Sja-DSP, MRAS and RAS, were determined through quantitative PCR analysis to be expressed in a stage-specific manner, with ERK, JNK and their inhibitor Sja-DSP markedly upregulated in adult female schistosomes. (c) 2006 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

Relevance:

80.00%

Publisher:

Abstract:

We describe a new method for using neural networks to predict residue contact pairs in a protein. The main inputs to the neural network are a set of 25 measures of correlated mutation between all pairs of residues in two windows of size 5 centered on the residues of interest. While the individual pair-wise correlations are a relatively weak predictor of contact, by training the network on windows of correlation the accuracy of prediction is significantly improved. The neural network is trained on a set of 100 proteins and then tested on a disjoint set of 1033 proteins of known structure. An average predictive accuracy of 21.7% is obtained taking the best L/2 predictions for each protein, where L is the sequence length. Taking the best L/10 predictions gives an average accuracy of 30.7%. The predictor is also tested on a set of 59 proteins from the CASP5 experiment. The accuracy is found to be relatively consistent across different sequence lengths, but to vary widely according to the secondary structure. Predictive accuracy is also found to improve by using multiple sequence alignments containing many sequences to calculate the correlations. (C) 2004 Wiley-Liss, Inc.
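The input encoding and the evaluation measure described in this abstract can be sketched as follows. For a residue pair (i, j), the 25 inputs are the correlated-mutation scores between every position in a window of size 5 around i and every position in a window of size 5 around j. Accuracy is then the fraction of true contacts among the best L/2 (or L/10) scored pairs. The function names, the zero padding at sequence ends, and the dictionary format for scores are assumptions made for illustration, not details taken from the paper.

```python
def window_features(corr, i, j, half=2, pad=0.0):
    """Collect the (2*half+1)^2 correlation values around the pair (i, j).

    corr: square matrix (list of lists) of correlated-mutation scores.
    pad:  value used where a window runs off the sequence end
          (an assumed convention; the abstract does not specify one).
    """
    n = len(corr)
    feats = []
    for di in range(-half, half + 1):
        for dj in range(-half, half + 1):
            a, b = i + di, j + dj
            inside = 0 <= a < n and 0 <= b < n
            feats.append(corr[a][b] if inside else pad)
    return feats

def top_fraction_accuracy(scores, contacts, seq_len, divisor=2):
    """Fraction of true contacts among the best seq_len // divisor pairs.

    scores:   dict mapping (i, j) -> predicted contact score.
    contacts: set of (i, j) pairs that are in contact in the structure.
    """
    k = max(1, seq_len // divisor)
    best = sorted(scores, key=scores.get, reverse=True)[:k]
    if not best:
        return 0.0
    return sum(1 for pair in best if pair in contacts) / len(best)

# Toy 6x6 matrix (hypothetical values) and one 25-value input vector.
corr = [[float(a * 10 + b) for b in range(6)] for a in range(6)]
print(len(window_features(corr, 2, 3)))  # 25
```

With `divisor=2` the second function corresponds to the "best L/2 predictions" measure and with `divisor=10` to the "best L/10" measure.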

Relevance:

80.00%

Publisher:

Abstract:

Potato type II serine proteinase inhibitors are proteins that consist of multiple sequence repeats, and exhibit a multidomain structure. The structural domains are circular permutations of the repeat sequence, as a result of intramolecular domain swapping. Structural studies give indications for the origins of this folding behaviour, and the evolution of the inhibitor family.

Relevance:

80.00%

Publisher:

Abstract:

Structural similarity among proteins is reflected in the distribution of hydropathicity along the amino acids in the protein sequence. Similarities in the hydropathy distributions are obvious for homologous proteins within a protein family. They also were observed for proteins with related structures, even when sequence similarities were undetectable. Here we present a novel method that employs the hydropathy distribution in proteins for identification of (sub)families in a set of (homologous) proteins. We represent proteins as points in a generalized hydropathy space, represented by vectors of specifically defined features. The features are derived from hydropathy of the individual amino acids. Projection of this space onto principal axes reveals groups of proteins with related hydropathy distributions. The groups identified correspond well to families of structurally and functionally related proteins. We found that this method accurately identifies protein families in a set of proteins, or subfamilies in a set of homologous proteins. Our results show that protein families can be identified by the analysis of hydropathy distribution, without the need for sequence alignment. (C) 2005 Wiley-Liss, Inc.
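The idea of representing a protein as a point in a hydropathy feature space can be sketched in a few lines. The abstract does not say which hydropathy scale or which features the authors used, so everything specific here is an assumption: the sketch uses the Kyte-Doolittle scale and a hypothetical feature vector made of the mean hydropathy in equal-length segments of the sequence. The resulting fixed-length vectors are the kind of input that could then be projected onto principal axes; that projection step is not shown.

```python
# Kyte-Doolittle hydropathy values (the choice of scale is an assumption).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq):
    """Map each amino acid (one-letter code) to its hydropathy value."""
    return [KD[aa] for aa in seq.upper()]

def segment_features(seq, n_segments=4):
    """Hypothetical feature vector: mean hydropathy of each of
    n_segments consecutive, roughly equal-length sequence segments."""
    profile = hydropathy_profile(seq)
    n = len(profile)
    feats = []
    for s in range(n_segments):
        start = s * n // n_segments
        end = (s + 1) * n // n_segments
        chunk = profile[start:end]
        # Empty segments (sequence shorter than n_segments) contribute 0.0.
        feats.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return feats

# Every sequence maps to a vector of the same length, regardless of its own length.
print(len(segment_features("ACDEFGHIKLMNPQRSTVWY", n_segments=4)))  # 4
```

Because the features depend only on each sequence's own hydropathy values, no alignment between sequences is required to compute them, which is the property the abstract emphasises.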

Relevance:

80.00%

Publisher:

Abstract:

Background: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance, with a CC of 0.57 and an RMSE of 0.79. In addition, combining the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy, with a CC of 0.60 and RMSE of 0.78, which is at least comparable to the other existing methods. Conclusion: The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
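As a rough illustration of the quantities named in this abstract, the sketch below computes a residue-wise contact order as the sum of sequence separations between a residue and the residues it contacts, together with the two reported evaluation measures (Pearson correlation coefficient and RMSE). The contact radius, the minimum sequence separation, the optional normalisation by sequence length, and all function names are assumptions; the abstract does not give the exact definition the authors used.

```python
import math

def rwco(cb_coords, radius, min_sep=1, normalize=False):
    """Sum of sequence separations |i - j| over residues j in contact with i.

    radius, min_sep and normalize are assumed parameters, not values
    taken from the abstract.
    """
    n = len(cb_coords)
    values = []
    for i in range(n):
        total = 0
        for j in range(n):
            sep = abs(i - j)
            if sep >= min_sep and math.dist(cb_coords[i], cb_coords[j]) <= radius:
                total += sep
        values.append(total / n if normalize else total)
    return values

def pearson_cc(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rmse(xs, ys):
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

# Toy coordinates (hypothetical): residue 3 folds back next to residue 0.
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
print(rwco(coords, radius=5.0))  # [4, 2, 1, 3]
```

In the toy example, residues 0 and 3 score highest because their contact spans three positions in sequence, which is the "long-range contact" signal that RWCO is meant to capture.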

Relevance:

80.00%

Publisher:

Abstract:

Scorpion toxins are important physiological probes for characterizing ion channels. Molecular databases have limited functional annotation of scorpion toxins. Their function can be inferred by searching for conserved motifs in sequence signature databases that are derived statistically but are not necessarily biologically relevant. Mutation studies provide biological information on residues and positions important for structure-function relationship but are not normally used for extraction of binding motifs. 3D structure analyses also aid in the extraction of peptide motifs in which non-contiguous residues are clustered spatially. Here we present new, functionally relevant peptide motifs for ion channels, derived from the analyses of scorpion toxin native and mutant peptides. Copyright (c) 2006 European Peptide Society and John Wiley & Sons, Ltd.

Relevance:

40.00%

Publisher:

Abstract:

In this study, we propose a novel method to predict the solvent accessible surface areas of transmembrane residues. For both transmembrane alpha-helix and beta-barrel residues, the correlation coefficients between the predicted and observed accessible surface areas are around 0.65. On the basis of predicted accessible surface areas, residues exposed to the lipid environment or buried inside a protein can be identified by using certain cutoff thresholds. We have extensively examined our approach based on different definitions of accessible surface areas and a variety of sets of control parameters. Given that experimentally determining the structures of membrane proteins is very difficult and membrane proteins are actually abundant in nature, our approach is useful for theoretically modeling membrane protein tertiary structures, particularly for modeling the assembly of transmembrane domains. This approach can be used to annotate the membrane proteins in proteomes to provide extra structural and functional information.

Relevance:

30.00%

Publisher:

Abstract:

This study demonstrates the effectiveness of a novel self-adjuvanting vaccine delivery system for multiple different synthetic peptide immunogens by use of lipid core peptide (LCP) technology. An LCP formulation incorporating two different protective epitopes of the surface antiphagocytic M protein of group A streptococci (GAS), the causative agents of rheumatic fever and subsequent rheumatic heart disease, was tested in a murine parenteral immunization and GAS challenge model. Mice were immunized with the LCP-GAS formulation, which contains an M protein amino-terminal type-specific peptide sequence (8830) in combination with a conserved non-host-cross-reactive carboxy-terminal C-region peptide sequence (J8) of the M protein. Our data demonstrated immunogenicity of the LCP-8830-J8 formulation in B10.BR mice when coadministered in complete Freund's adjuvant and in the absence of a conventional adjuvant. In both cases, immunization led to induction of high-titer GAS peptide-specific serum immunoglobulin G antibody responses and induction of highly opsonic antibodies that did not cross-react with human heart tissue proteins. Moreover, mice were completely protected from GAS infection when immunized with LCP-8830-J8 in the presence or absence of a conventional adjuvant. Mice were not protected, however, following immunization with an LCP formulation containing a control peptide from a Schistosoma sp. These data support the potential of LCP technology in the development of novel self-adjuvanting multi-antigen component vaccines and point to the potential application of this system in the development of human vaccines against infectious diseases.

Relevance:

30.00%

Publisher:

Abstract:

Tartrate-resistant acid phosphatase (TRAP) is highly expressed in osteoclasts and in a subset of tissue macrophages and dendritic cells. It is expressed at lower levels in the parenchymal cells of the liver, glomerular mesangial cells of the kidney and pancreatic acinar cells. We have identified novel TRAP mRNAs that differ in their 5'-untranslated region (5'-UTR) sequence, but align with the known murine TRAP mRNA from the first base of Exon 2. The novel 5'-UTRs represent alternative first exons located upstream of the known 5'-UTR. A similar genomic structure exists for the human TRAP gene with partial conservation of the exon and promoter sequences. Expression of the most distal 5'-UTR (Exon 1A) is restricted to adult bone and spleen tissue. Exon 1B is expressed primarily in tissues containing TRAP-positive nonhaematopoietic cells. The known TRAP 5'-UTR (Exon 1) is expressed in tissues characteristic of myeloid cell expression. In addition, the Exon 1C promoter sequence is shown to comprise distinct transcription start regions, with an osteoclast-specific transcription initiation site identified downstream of a TATA-like element. Macrophages are shown to initiate transcription of the Exon 1C transcript from a purine-rich region located upstream of the osteoclast-specific transcription start point. The distinct expression patterns for each of the TRAP 5'-UTRs suggest that TRAP mRNA expression is regulated by the use of four alternative tissue- and cell-restricted promoters. (C) 2003 Elsevier Science B.V. All rights reserved.