Cluster identification and separation in the growing self-organizing map: application in protein sequence classification


Author(s): Ahmad, Norashikin; Alahakoon, Damminda; Chau, Rowena
Date(s)

01/06/2010

Abstract

The growing self-organizing map (GSOM) was introduced as an improvement on the self-organizing map (SOM) algorithm for clustering and knowledge discovery. Unlike the traditional SOM, the GSOM has a dynamic structure that allows nodes to grow, reflecting the knowledge discovered from the input data as learning progresses. The spread factor (SF) parameter in the GSOM controls the spread of the map, giving an analyst the flexibility to examine clusters at different granularities. Although the GSOM has been applied in various areas and has proven effective in knowledge discovery tasks, no comprehensive study has been done on the effect of the spread factor value on cluster formation and separation. The aim of this paper is therefore to investigate the effect of the spread factor value on cluster separation in the GSOM. We use a simple k-means algorithm to identify clusters in the GSOM. Using the Davies–Bouldin index, clusters formed at different spread factor values are obtained and the resulting clusters are analyzed. We show that clusters can be more clearly separated when the spread factor value is increased. Hierarchical clusters can then be constructed by mapping the GSOM clusters at different spread factor values.
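
The cluster identification step described in the abstract (k-means on the map, with the Davies–Bouldin index used to compare candidate clusterings) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the node weight vectors are synthetic, the farthest-first seeding is a swapped-in choice that keeps the run deterministic, and the function names (`growth_threshold`, `kmeans`, `davies_bouldin`) are hypothetical. The growth threshold formula GT = -D × ln(SF) is the one commonly given in the GSOM literature; the abstract itself does not state it.

```python
import math


def growth_threshold(dim, spread_factor):
    """Growth threshold as commonly given for GSOM: GT = -D * ln(SF), 0 < SF < 1.
    A higher SF gives a lower threshold, so the map grows (spreads) more."""
    return -dim * math.log(spread_factor)


def kmeans(points, k, iters=100):
    """Lloyd's k-means with farthest-first seeding (an assumption made here
    for determinism). Returns (centroids, labels)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def davies_bouldin(points, centroids, labels):
    """Davies-Bouldin index: mean over clusters of the worst-case ratio
    (scatter_i + scatter_j) / distance(centroid_i, centroid_j). Lower is better."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        scatter.append(sum(math.dist(p, centroids[c]) for p in members) / len(members)
                       if members else 0.0)
    total = 0.0
    for i in range(k):
        ratios = []
        for j in range(k):
            if j != i:
                sep = math.dist(centroids[i], centroids[j])
                ratios.append((scatter[i] + scatter[j]) / sep if sep > 0 else math.inf)
        total += max(ratios)
    return total / k


# Synthetic stand-in for GSOM node weight vectors: three tight groups in 2-D.
nodes = [(cx + dx, cy + dy)
         for cx, cy in [(0, 0), (10, 0), (0, 10)]
         for dx in (-0.1, 0.1)
         for dy in (-0.1, 0.1)]

# Score each candidate number of clusters and keep the lowest index.
scores = {}
for k in range(2, 6):
    cents, labs = kmeans(nodes, k)
    scores[k] = davies_bouldin(nodes, cents, labs)
best_k = min(scores, key=scores.get)
```

In an actual study the `nodes` list would be replaced by the weight vectors of a GSOM trained at a given spread factor, and the comparison would be repeated for each spread factor value of interest.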

Identifier

http://hdl.handle.net/10536/DRO/DU:30064344

Language(s)

eng

Publisher

Springer-Verlag

Relation

http://dro.deakin.edu.au/eserv/DU:30064344/alahakoon-clusteridentification-2010.pdf

http://dx.doi.org/10.1007/s00521-009-0300-0

Rights

2010, Springer

Keywords #Cluster identification #cluster separation #unsupervised neural networks #dynamic self-organizing map #protein sequence classification
Type

Journal Article