Data mining the yeast genome in a lazy functional language


Author(s): Clare, Amanda; King, Ross Donald
Contributor(s)

Department of Computer Science

Bioinformatics and Computational Biology Group

Date(s)

24/04/2006

24/04/2006

2003

Abstract

Clare, A. and King, R.D. (2003) Data mining the yeast genome in a lazy functional language. In Practical Aspects of Declarative Languages (PADL'03) (won Best/Most Practical Paper award).

Critics of lazy functional languages contend that the languages are only suitable for toy problems and are not used for real systems. We present an application (PolyFARM) for distributed data mining in relational bioinformatics data, written in the lazy functional language Haskell. We describe the problem we wished to solve, the reasons we chose Haskell and relate our experiences. Laziness did cause many problems in controlling heap space usage, but these were solved by a variety of methods. The many advantages of writing software in Haskell outweighed these problems. These included clear expression of algorithms, good support for data structures, abstraction, modularity and generalisation leading to fast prototyping and code reuse, parsing tools, profiling tools, language features such as strong typing and referential transparency, and the support of an enthusiastic Haskell community. PolyFARM is currently in use mining data from the Saccharomyces cerevisiae genome and is freely available for non-commercial use at http://www.aber.ac.uk/compsci/Research/bio/dss/polyfarm/.

Non peer reviewed

Identifier

Clare, A & King, R D 2003, 'Data mining the yeast genome in a lazy functional language'.

PURE: 68087

PURE UUID: e0f72118-1fc1-49d4-a371-089484a62c67

dspace: 2160/130

http://hdl.handle.net/2160/130

Language(s)

eng

Type

/dk/atira/pure/researchoutput/researchoutputtypes/contributiontoconference/paper

Rights