Exploiting sequence dependencies in the prediction of peroxisomal proteins


Author(s): Wakabayashi, M.; Hawkins, J. C.; Maetschke, S. R.; Boden, M. B.
Contributor(s)

M. Gallagher

J. Hogan

F. Maire

Date(s)

01/01/2005

Abstract

Prediction of peroxisomal matrix proteins generally depends on the presence of one of two distinct motifs at the end of the amino acid sequence. PTS1 peroxisomal proteins have a well-conserved tripeptide at the C-terminal end. However, the preceding residues in the sequence arguably play a crucial role in targeting the protein to the peroxisome. Previous work in applying machine learning to the prediction of peroxisomal matrix proteins has failed to capitalize on the full extent of these dependencies. We benchmark a range of machine learning algorithms, and show that a classifier based on the Support Vector Machine produces more accurate results when dependencies between the conserved motif and the preceding section are exploited. We publish an updated and rigorously curated data set that results in increased prediction accuracy of most tested models.
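
As a rough illustration of the kind of input such a classifier might see, the sketch below splits a sequence into its C-terminal tripeptide and the residues immediately preceding it, then one-hot encodes both parts into a single feature vector. The window of 9 preceding residues, the padding character, the function names, and the example sequence are all assumptions made for illustration; the abstract does not specify the encoding the authors used.

```python
# Hypothetical feature-encoding sketch; not the authors' published method.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def split_c_terminus(sequence, upstream_len=9):
    """Return (upstream, tripeptide) taken from the C-terminal end.

    upstream_len=9 is an assumed window size, not a value from the paper.
    """
    if len(sequence) < 3:
        raise ValueError("sequence is shorter than a tripeptide")
    tripeptide = sequence[-3:]
    upstream = sequence[-(3 + upstream_len):-3]
    # Left-pad short sequences with '-' so every example has the same width.
    return upstream.rjust(upstream_len, "-"), tripeptide


def one_hot(window):
    """Encode each residue as 20 binary indicators; padding maps to all zeros."""
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def encode(sequence, upstream_len=9):
    """Concatenate upstream and tripeptide encodings into one vector."""
    upstream, tripeptide = split_c_terminus(sequence, upstream_len)
    return one_hot(upstream + tripeptide)


if __name__ == "__main__":
    example = "MKTAYIAKQRSKL"  # made-up sequence ending in the tripeptide SKL
    print(split_c_terminus(example))
    print(len(encode(example)))
```

Encoding the preceding residues and the tripeptide in one vector lets a kernel-based classifier weigh positions in the two regions jointly, which is one plausible way to expose the dependencies the abstract refers to.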

Identifier

http://espace.library.uq.edu.au/view/UQ:102605

Language(s)

eng

Publisher

Springer-Verlag

Keywords

#E1 #280207 Pattern Recognition #780101 Mathematical sciences

Type

Conference Paper