Supplementary information : weighted tree kernels for sequence analysis

**Author(s):** Bowles, Christopher J.; Hogan, James M.
Date(s)	23/04/2014
Abstract	Genomic sequences are fundamentally text documents, admitting various representations according to need and tokenization. Gene expression depends crucially on binding of enzymes to the DNA sequence at small, poorly conserved binding sites, limiting the utility of standard pattern search. However, one may exploit the regular syntactic structure of the enzyme's component proteins and the corresponding binding sites, framing the problem as one of detecting grammatically correct genomic phrases. In this paper we propose new kernels based on weighted tree structures, traversing the paths within them to capture the features which underpin the task. Experimentally, we find that these kernels provide performance comparable with state-of-the-art approaches for this problem, while offering significant computational advantages over earlier methods. The methods proposed may be applied to a broad range of sequence or tree-structured data in molecular biology and other domains.
Format	application/pdf
Identifier	http://eprints.qut.edu.au/67877/
Publisher	Springer Verlag
Relation	http://eprints.qut.edu.au/67877/1/BowlesHoganSupp.pdf Bowles, Christopher J. & Hogan, James M. (2014) Supplementary information : weighted tree kernels for sequence analysis. Proceedings of ESANN 2014. (In Press)
Rights	Copyright 2014 Please consult the authors
Source	School of Electrical Engineering & Computer Science; Science & Engineering Faculty
Keywords	#060102 Bioinformatics #080100 ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING #080301 Bioinformatics Software #Bioinformatics #Kernel methods #Machine learning #Genomics
Type	Other