Feature selection environment for genomic applications


Author(s): LOPES, Fabricio Martins; MARTINS JR., David Correa; CESAR JR., Roberto M.
Contributor(s)

UNIVERSIDADE DE SÃO PAULO

Date(s)

19/04/2012

19/04/2012

2008

Abstract

Background: Feature selection is a pattern recognition approach for choosing important variables according to some criterion in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). Many genomic and proteomic applications rely on feature selection to answer questions such as selecting signature genes that are informative about some biological state (e.g., normal tissues and several types of cancer) or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples to perform an adequate estimate of the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to point out the best solution for each application. Results: The intent of this work is to provide an open-source multiplatform graphical environment for bioinformatics problems, which supports many feature selection algorithms, criterion functions and graphic visualization tools such as scatterplots, parallel coordinates and graphs. A feature selection approach for growing genetic networks from seed genes (targets or predictors) is also implemented in the system. Conclusion: The proposed feature selection environment allows data analysis using several algorithms, criterion functions and graphic visualization tools. Our experiments have shown the software's effectiveness in two distinct types of biological problems. In addition, the environment can be used in different pattern recognition applications, although the main focus is on bioinformatics tasks.
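To make the terms "feature selection algorithm" and "criterion function" concrete, the following is a minimal Python sketch, not the software described in the abstract. The abstract does not name specific algorithms or criteria, so the choices here are assumptions made for illustration: a greedy forward search as the algorithm, and the mean conditional entropy of the target given the selected features as the criterion. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, target, features):
    """Estimate H(target | features) in bits from observed counts.

    rows: list of tuples of discrete values; target and features are
    column indices. Assumed criterion, chosen only for illustration.
    """
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[f] for f in features)
        groups[key][row[target]] += 1
    total = len(rows)
    h = 0.0
    for counts in groups.values():
        n = sum(counts.values())
        h_group = -sum((c / n) * log2(c / n) for c in counts.values())
        h += (n / total) * h_group  # weight by the estimated P(features)
    return h


def forward_selection(rows, target, candidates, k):
    """Greedily add the feature that most lowers H(target | selected)."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        best = min(
            remaining,
            key=lambda f: conditional_entropy(rows, target, selected + [f]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): columns 0-2 are binary "gene" states,
# column 3 is the target, defined as column 0 AND column 1.
rows = [(a, b, c, a & b) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
chosen = forward_selection(rows, target=3, candidates=[0, 1, 2], k=2)
print(chosen)
```

The conditional entropy is computed from joint counts of feature states, so with few samples many state combinations are observed once or never and the estimate becomes unreliable. That is the small-sample problem the abstract refers to.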

FAPESP

CNPq

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Identifier

BMC BIOINFORMATICS, LONDON, v.9, OCT 22, 2008

1471-2105

http://producao.usp.br/handle/BDPI/16658

10.1186/1471-2105-9-451

http://dx.doi.org/10.1186/1471-2105-9-451

Language(s)

eng

Publisher

BIOMED CENTRAL LTD

LONDON

Relation

BMC Bioinformatics

Rights

openAccess

Copyright BIOMED CENTRAL LTD

Keywords

#COEFFICIENT #Biochemical Research Methods #Biotechnology & Applied Microbiology #Mathematical & Computational Biology

Type

article

original article

publishedVersion