Synthesizing Regularity Exposing Attributes in Large Protein Databases


Author(s): de la Maza, Michael
Date(s)

20/10/2004

20/10/2004

01/05/1993

Abstract

This thesis describes a system that synthesizes regularity-exposing attributes from large protein databases. After processing primary and secondary structure data, the system discovers an amino acid representation that captures what are thought to be the three most important amino acid characteristics for tertiary structure prediction: size, charge, and hydrophobicity. A neural network trained using this 16-bit representation achieves an accuracy on the secondary structure prediction problem comparable to that of a neural network trained using the standard 24-bit amino acid representation. In addition, the thesis describes bounds on secondary structure prediction accuracy, derived using an optimal learning algorithm and the probably approximately correct (PAC) model.

Format

90 p.

204397 bytes

794429 bytes

application/octet-stream

application/pdf

Identifier

AITR-1444

http://hdl.handle.net/1721.1/6789

Language(s)

en_US

Relation

AITR-1444

Keywords #representation reformulation #secondary structure prediction #genetic algorithms #neural networks #clustering algorithms #decision tree systems