18 results for sequence similarity searches


Relevance:

30.00%

Publisher:

Abstract:

The recently described cupin superfamily of proteins includes the germin and germin-like proteins, of which the cereal oxalate oxidase is the best characterized. This superfamily also includes seed storage proteins, in addition to several microbial enzymes and proteins of unknown function. All these proteins are characterized by the conservation of two central motifs, usually containing two or three histidine residues presumed to be involved in metal binding at the catalytic active site. The present study of the coding regions of Synechocystis PCC6803 identifies a previously unknown group of 12 related cupins, each containing the characteristic two-motif signature. This group comprises 11 single-domain proteins, ranging in length from 104 to 289 residues, and includes two phosphomannose isomerases and two epimerases involved in cell wall synthesis, a member of the pirin group of nuclear proteins, a possible transcriptional regulator, and a close relative of a cytochrome c551 from Rhodococcus. Additionally, there is a duplicated, two-domain protein that has close similarity to an oxalate decarboxylase from the fungus Collybia velutipes and that is a putative progenitor of the storage proteins of land plants.

Relevance:

30.00%

Publisher:

Abstract:

Motivation: The ability of a simple method (MODCHECK) to determine the sequence–structure compatibility of a set of structural models generated by fold recognition is tested in a thorough benchmark analysis. Four Model Quality Assessment Programs (MQAPs) were tested on 188 targets from the latest LiveBench-9 automated structure evaluation experiment. We systematically test and evaluate whether the MQAP methods can successfully detect native-like models. Results: We show that, compared with the other three methods tested, MODCHECK is the most reliable at consistently selecting the best top model and at ranking the models. In addition, we show that the choice of model similarity score used to assess a model's similarity to the experimental structure can influence the overall performance of these tools. Although these MQAP methods fail to improve model selection for methods that already incorporate three-dimensional (3D) protein structural information, an improvement is observed for methods that are purely sequence-based, including the best profile–profile methods. This suggests that even the best sequence-based fold recognition methods can still be improved by taking 3D structural information into account.
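The two evaluation criteria named in this abstract, top model selection and model ranking, can be illustrated with a minimal sketch. It is not the benchmark procedure used in the paper: the model names, scores, and the choice of Spearman rank correlation as the ranking measure are all assumptions made for illustration.

```python
from typing import List, Tuple

# (model name, score predicted by a quality-assessment program,
#  observed similarity to the experimental structure).
# All values below are hypothetical illustration data.
Model = Tuple[str, float, float]


def average_ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # undefined when one side is constant; report 0 here
    return cov / (vx * vy) ** 0.5


def select_top_model(models: List[Model]) -> Model:
    """Pick the model the assessment program scores highest."""
    return max(models, key=lambda m: m[1])


models: List[Model] = [
    ("model_a", 0.62, 0.48),
    ("model_b", 0.81, 0.71),
    ("model_c", 0.35, 0.20),
    ("model_d", 0.77, 0.75),
]

best = select_top_model(models)
best_possible = max(m[2] for m in models)
selection_loss = best_possible - best[2]  # quality lost by the choice
rho = spearman([m[1] for m in models], [m[2] for m in models])
print(best[0], round(selection_loss, 2), round(rho, 2))
```

Selection loss and rank correlation measure different things: a program can rank most models well yet still miss the single best one, which is one reason the abstract reports top model selection and ranking separately.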

Relevance:

30.00%

Publisher:

Abstract:

Motivation: In order to enhance genome annotation, the fully automatic fold recognition method GenTHREADER has been improved and benchmarked. The previous version of GenTHREADER consisted of a simple neural network which was trained to combine sequence alignment score, length information and energy potentials derived from threading into a single score representing the relationship between two proteins, as designated by CATH. The improved version incorporates PSI-BLAST searches, which have been jumpstarted with structural alignment profiles from FSSP, and now also makes use of PSIPRED predicted secondary structure and bi-directional scoring in order to calculate the final alignment score. Pairwise potentials and solvation potentials are calculated from the given sequence alignment and are then used as inputs to a multi-layer, feed-forward neural network, along with the alignment score, alignment length and sequence length. The neural network has also been expanded to accommodate the secondary structure element alignment (SSEA) score as an extra input, and it is now trained to learn the FSSP Z-score as a measure of similarity between two proteins. Results: The improvements made to GenTHREADER increase the number of remote homologues that can be detected with a low error rate, implying higher reliability of the score, whilst also increasing the quality of the models produced. We find that up to five times as many true positives can be detected with a low error rate per query. Total MaxSub score is doubled at low false positive rates using the improved method.
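The way the six inputs listed in this abstract could be combined by a feed-forward network can be sketched as follows. This is an illustration only, not GenTHREADER's actual network: the hidden-layer size, the sigmoid activation, the linear output unit, the random weights, and the scaled input values are all assumptions, and no training is performed.

```python
import math
import random
from typing import List, Tuple

# Input features named in the abstract; the order is arbitrary here.
FEATURES = [
    "alignment_score",
    "alignment_length",
    "sequence_length",
    "pairwise_potential",
    "solvation_potential",
    "ssea_score",
]

# (hidden-layer weights, output-unit weights); each weight vector
# stores its bias as the last element.
Network = Tuple[List[List[float]], List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def make_network(n_inputs: int, n_hidden: int, seed: int = 0) -> Network:
    """Build a network with random (untrained) weights."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network: Network, inputs: List[float]) -> float:
    """One forward pass: sigmoid hidden layer, then a linear output unit."""
    hidden_w, output_w = network
    if len(inputs) != len(hidden_w[0]) - 1:
        raise ValueError("unexpected number of input features")
    hidden_act = [
        sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w[:-1], inputs)))
        for w in hidden_w
    ]
    return output_w[-1] + sum(
        wi * hi for wi, hi in zip(output_w[:-1], hidden_act)
    )


# Hypothetical feature values, assumed to be pre-scaled to roughly [0, 1].
example = {
    "alignment_score": 0.42,
    "alignment_length": 0.65,
    "sequence_length": 0.70,
    "pairwise_potential": 0.31,
    "solvation_potential": 0.55,
    "ssea_score": 0.60,
}

network = make_network(n_inputs=len(FEATURES), n_hidden=4)
score = forward(network, [example[name] for name in FEATURES])
print(f"combined score (untrained network): {score:.3f}")
```

A linear output unit is used here because the abstract describes the network as learning the FSSP Z-score, which is a continuous quantity; with trained weights the output would approximate that value, whereas the random weights above produce a meaningless number.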