On the simultaneous use of clinical and microarray expression data in the cluster analysis of tissue samples


Author(s): McLachlan, G. J.; Chang, S.; Mar, J.; Ambroise, C.
Contributor(s)

Yi-Ping Phoebe Chen

Date(s)

01/01/2004

Abstract

This paper considers a model-based approach to the clustering of tissue samples on the basis of a very large number of genes from microarray experiments. It is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. Frequently in practice, clinical data are also available on the cases from which the tissue samples were obtained. Here we investigate how to use the clinical data in conjunction with the microarray gene expression data to cluster the tissue samples. We propose two mixture model-based approaches in which the number of components in the mixture model corresponds to the number of clusters to be imposed on the tissue samples. The first approach specifies the components of the mixture model to be the conditional distributions of the microarray data given the clinical data, with the mixing proportions also conditioned on the clinical data. The second takes the components of the mixture model to represent the joint distributions of the clinical and microarray data. The approaches are demonstrated on some breast cancer data, as studied recently in van't Veer et al. (2002).
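
As a reading aid, the two approaches described in the abstract can be written schematically as below. The notation is an assumption introduced here and does not come from the record: y_j is the microarray expression vector for tissue sample j, x_j is the clinical data for the same case, g is the number of components (clusters), pi_i are the mixing proportions, and f_i are the component densities with parameters theta_i.

```latex
% Schematic only; all symbols are assumed notation, not taken from the source.

% Approach 1: components are conditional distributions of the microarray
% data given the clinical data; mixing proportions also depend on x_j.
f(y_j \mid x_j) = \sum_{i=1}^{g} \pi_i(x_j)\, f_i(y_j \mid x_j;\, \theta_i)

% Approach 2: components are joint distributions of the clinical and
% microarray data; mixing proportions are constants.
f(x_j, y_j) = \sum_{i=1}^{g} \pi_i\, f_i(x_j, y_j;\, \theta_i)
```

In mixture model-based clustering generally, each sample is assigned to the component with the highest estimated posterior probability of membership. The abstract does not state which component forms the authors use, so none are assumed here.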

Identifier

http://espace.library.uq.edu.au/view/UQ:100062

Language(s)

eng

Publisher

Australian Computer Society

Keywords

#Microarrays #Gene expressions #Mixture modelling #Cluster analysis #Clinical data #E1 #230204 Applied Statistics #780101 Mathematical sciences

Type

Conference Paper