The full Bayesian significance test for mixture models: results in gene expression clustering


Author(s): LAURETTO, M. S.; PEREIRA, C. A. B.; STERN, J. M.
Contributor(s)

UNIVERSIDADE DE SÃO PAULO

Date(s)

17/04/2012

17/04/2012

2008

Abstract

Gene clustering is a useful exploratory technique for grouping genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, in which the number of clusters is determined via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests whether it should in fact have m - 1 components. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues, increasing m, until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach that requires neither model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal component analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
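
The sequential procedure described in the abstract can be illustrated with a minimal Python sketch. It covers only the outer loop over m. The significance test itself is passed in as a hypothetical callable, `rejects_reduction`, and is not an implementation of the full Bayesian significance test. The name `select_num_components`, the `max_components` safety cap, and the choice to report m - 1 once the hypothesis is accepted are assumptions of this sketch, not details taken from the paper.

```python
from typing import Callable


def select_num_components(
    rejects_reduction: Callable[[int], bool],
    max_components: int = 20,
) -> int:
    """Estimate the number of mixture components by sequential testing.

    rejects_reduction(m) is a user-supplied (hypothetical) test that returns
    True when the hypothesis "the m-component mixture should in fact have
    m - 1 components" is rejected.
    """
    m = 2  # the abstract starts with m = 2
    while m <= max_components:
        if not rejects_reduction(m):
            # Hypothesis accepted: m - 1 components are taken as sufficient
            # (this reading of the stopping rule is an assumption).
            return m - 1
        m += 1  # hypothesis rejected: consider a larger mixture
    # Safety cap (an assumption of this sketch): every tested reduction
    # was rejected, so report the largest size considered.
    return max_components


def stub_test(m: int, true_k: int = 4) -> bool:
    """Toy stand-in for a real test: rejects while m - 1 is below true_k."""
    return m - 1 < true_k


if __name__ == "__main__":
    print(select_num_components(stub_test))
```

With the toy stub, the loop rejects the reduction for m = 2, 3, 4 and accepts it at m = 5, so it reports four components. A real application would replace `stub_test` with a test computed from a fitted mixture model.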

Identifier

GENETICS AND MOLECULAR RESEARCH, v.7, n.3, p.883-897, 2008

1676-5680

http://producao.usp.br/handle/BDPI/14572

http://www.geneticsmr.com//year2008/vol7-3/pdf/x-meeting06.pdf

Language(s)

eng

Publisher

FUNPEC-EDITORA

Relation

Genetics and Molecular Research

Rights

openAccess

Copyright FUNPEC-EDITORA

Keywords

#Gene clustering #Mixture models #Significance test #Expression data analysis #EM ALGORITHM #Biochemistry & Molecular Biology #Genetics & Heredity

Type

article

proceedings paper

publishedVersion