Supplementary information : weighted tree kernels for sequence analysis
Date(s) | 23/04/2014 |
---|---|
Abstract | Genomic sequences are fundamentally text documents, admitting various representations according to need and tokenization. Gene expression depends crucially on the binding of enzymes to the DNA sequence at small, poorly conserved binding sites, limiting the utility of standard pattern search. However, one may exploit the regular syntactic structure of the enzyme's component proteins and the corresponding binding sites, framing the problem as one of detecting grammatically correct genomic phrases. In this paper we propose new kernels based on weighted tree structures, traversing the paths within them to capture the features which underpin the task. Experimentally, we find that these kernels provide performance comparable with state-of-the-art approaches for this problem, while offering significant computational advantages over earlier methods. The methods proposed may be applied to a broad range of sequence or tree-structured data in molecular biology and other domains. |
Format | application/pdf |
Identifier | |
Publisher | Springer Verlag |
Relation | http://eprints.qut.edu.au/67877/1/BowlesHoganSupp.pdf Bowles, Christopher J. & Hogan, James M. (2014) Supplementary information : weighted tree kernels for sequence analysis. Proceedings of ESANN 2014. (In Press) |
Rights | Copyright 2014. Please consult the authors. |
Source | School of Electrical Engineering & Computer Science; Science & Engineering Faculty |
Keywords | #060102 Bioinformatics #080100 ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING #080301 Bioinformatics Software #Bioinformatics #Kernel methods #Machine learning #Genomics |
Type | Other |