Partitions selection strategy for set of clustering solutions
Contributor(s) |
UNIVERSIDADE DE SÃO PAULO |
---|---|
Date(s) |
20/10/2012
2010
|
Abstract |
Clustering is a difficult task: there is no single definition of a cluster, and the data can have more than one underlying structure. Pareto-based multi-objective genetic algorithms, such as MOCK (Multi-Objective Clustering with automatic K-determination) and MOCLE (Multi-Objective Clustering Ensemble), were proposed to tackle these problems. However, the output of such algorithms often contains a large number of partitions, making it difficult for an expert to analyze all of them manually. To deal with this problem, we present two selection strategies, both based on the corrected Rand index, that choose a subset of solutions. To test them, we apply them to the sets of solutions produced by MOCK and MOCLE for several datasets. The study was also extended to select a reduced set of partitions from the initial population of MOCLE. These analyses show that both versions of the proposed selection strategy are very effective: they can significantly reduce the number of solutions while preserving the quality and diversity of the partitions in the original set. © 2010 Elsevier B.V. All rights reserved. |
Identifier |
NEUROCOMPUTING, v.73, n.16-18, Special Issue, p.2809-2819, 2010 0925-2312 http://producao.usp.br/handle/BDPI/28751 10.1016/j.neucom.2010.03.028 |
Language(s) |
eng |
Publisher |
ELSEVIER SCIENCE BV |
Relation |
Neurocomputing |
Rights |
restrictedAccess Copyright ELSEVIER SCIENCE BV |
Keywords | #Clustering #Model selection #GENE-EXPRESSION SIGNATURES #MOLECULAR CLASSIFICATION #MICROARRAY DATA #CLASS DISCOVERY #CANCER #PREDICTION #VALIDATION #CARCINOMAS #LEUKEMIA #Computer Science, Artificial Intelligence |
Type |
article proceedings paper publishedVersion |
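
The abstract says that the selection strategies are based on the corrected Rand index but does not spell out how they work. The sketch below is therefore only an illustration, not the authors' method: it computes the corrected Rand index between two partitions and then applies a simple greedy filter that drops any partition too similar to one already kept. The function names `corrected_rand` and `select_diverse`, and the `threshold` parameter with its default of 0.9, are assumptions made for this example.

```python
from collections import Counter
from math import comb


def corrected_rand(a, b):
    """Corrected (adjusted) Rand index between two labelings of the same objects."""
    if len(a) != len(b):
        raise ValueError("partitions must label the same objects")
    n = len(a)
    if n < 2:
        return 1.0  # no object pairs to compare
    pairs_total = comb(n, 2)
    # Number of object pairs placed together by both partitions, by a, and by b.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / pairs_total  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


def select_diverse(partitions, threshold=0.9):
    """Hypothetical greedy filter: keep a partition only if its corrected Rand
    index with every partition kept so far is below `threshold`.
    Returns the indices of the kept partitions."""
    kept = []
    for idx, candidate in enumerate(partitions):
        if all(corrected_rand(candidate, partitions[k]) < threshold for k in kept):
            kept.append(idx)
    return kept


if __name__ == "__main__":
    solutions = [
        [0, 0, 1, 1],
        [1, 1, 0, 0],  # same grouping as the first one, labels swapped
        [0, 1, 0, 1],
        [0, 0, 0, 1],
    ]
    print(select_diverse(solutions))
```

Because the index depends only on which objects are grouped together, relabeled copies of the same partition score 1.0 and are filtered out. A greedy pass like this depends on the order of the input, so a real strategy would likely also take partition quality into account when deciding which representative to keep.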