Outlier Fuzzy Kernel Clustering Algorithm (离群模糊核聚类算法)


Author(s): Shen Hongbin (沈红斌); Wang Shitong (王士同); Wu Xiaojun (吴小俊)
Date(s)

2004

Abstract

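Generally speaking, outliers are data points that lie far from the rest of the data, yet they may carry extremely important information. A new outlier fuzzy kernel clustering algorithm is proposed to discover the outliers in a sample set. A Mercer kernel maps the original data space into a feature space, and each vector in the feature space is assigned a dynamic weight. Building on the classical FCM (fuzzy c-means) clustering algorithm, a new clustering objective function is obtained in the feature space. Optimizing this objective function yields the final weight of each datum, and the outliers in the sample set are identified according to the magnitude of these weights. Simulation results demonstrate the feasibility and effectiveness of the proposed outlier fuzzy kernel clustering algorithm.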

Outliers are data values that lie far from the general clusters of the other data values, yet an outlier may reveal the most important feature of a dataset. In this paper, a new fuzzy kernel clustering algorithm is presented to locate the critical areas that are often represented by only a few outliers. Through Mercer kernel functions, the data in the original space are first mapped to a high-dimensional feature space. A modified objective function for fuzzy clustering is then introduced in the feature space. An additional weighting factor is assigned to each vector in the feature space, and the weight is updated using iterative formulas derived from the objective function. The final weight of a datum measures how representative that datum is of the dataset. With these weights, experts can identify the outliers easily. The simulations demonstrate the feasibility of this method.
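
The record does not give the paper's objective function or update formulas, so the following is only a minimal sketch of the general idea, under stated assumptions rather than the authors' method. It assumes a Gaussian kernel (for which the squared feature-space distance is 2(1 - K(x, v))), cluster prototypes kept in the input space, and a hypothetical objective J = Σ_k w_k^p Σ_i u_ik^m d_ik² with Σ_k w_k = 1, whose Lagrangian gives w_k ∝ D_k^(-1/(p-1)) with D_k = Σ_i u_ik^m d_ik². The names `weighted_kernel_fcm`, `p`, `sigma` and `init_centers` are illustrative and do not come from the paper.

```python
import numpy as np

def gaussian_kernel(X, V, sigma):
    # K[i, k] = exp(-||x_k - v_i||^2 / (2 * sigma^2)), shape (c, n)
    sq = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def weighted_kernel_fcm(X, init_centers, m=2.0, p=2.0, sigma=2.0,
                        n_iter=50, eps=1e-12):
    """Kernel fuzzy c-means with one weight per sample (illustrative only).

    Assumed objective: sum_k w_k^p * sum_i u_ik^m * d_ik^2, sum_k w_k = 1,
    where d_ik^2 = 2 * (1 - K(x_k, v_i)) for the Gaussian kernel.
    """
    X = np.asarray(X, dtype=float)
    V = np.array(init_centers, dtype=float)   # caller-supplied start centres
    for _ in range(n_iter):
        K = gaussian_kernel(X, V, sigma)
        d2 = 2.0 * (1.0 - K) + eps             # squared feature-space distance
        # Standard FCM membership update, using the kernel-induced distance.
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
        # Weight update from the assumed objective: large cost -> small weight.
        D = (U ** m * d2).sum(axis=0)
        w = D ** (-1.0 / (p - 1.0))
        w /= w.sum()
        # Centre update: weighted mean, each point scaled by w^p * u^m * K.
        coef = (w ** p) * (U ** m) * K
        V = coef @ X / coef.sum(axis=1, keepdims=True)
    return U, w, V

# Two tight clusters plus one distant point; the distant point is expected
# to end up with the smallest weight.
rng = np.random.default_rng(0)
a = rng.normal([0.0, 0.0], 0.3, size=(20, 2))
b = rng.normal([4.0, 4.0], 0.3, size=(20, 2))
X = np.vstack([a, b, [[12.0, -8.0]]])
U, w, V = weighted_kernel_fcm(X, [[0.5, 0.5], [3.5, 3.5]])
outlier_index = int(np.argmin(w))
```

In this sketch a point far from every prototype has a kernel value near zero for all clusters, so its cost D_k is close to the maximum and its weight is the smallest; ranking the samples by `w` therefore surfaces candidate outliers. The kernel width `sigma` and the exponents `m` and `p` are free parameters here and would need tuning on real data.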

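Open Project of the Jiangsu Provincial Key Laboratory of Computer Information Technology; Open Project of the State Key Laboratory for Novel Software Technology, Nanjing University; Natural Science Foundation of Jiangsu Province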

Identifier

http://ir.sia.ac.cn//handle/173321/3043

http://www.irgrid.ac.cn/handle/1471x/171714

Language(s)

Chinese

Keywords #outlier #fuzzy #kernel function #feature space #clustering algorithm
Type

Journal article