Journal of Frontiers of Computer Science and Technology ›› 2014, Vol. 8 ›› Issue (11): 1365-1372. DOI: 10.3778/j.issn.1673-9418.1405038

• Artificial Intelligence and Pattern Recognition •

Spectral Clustering Algorithm Based on Effective Distance

GUANG Junye (光俊叶)1, LIU Mingxia (刘明霞)1,2, ZHANG Daoqiang (张道强)1+

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  2. College of Information Science and Technology, Taishan University, Taian, Shandong 271021, China
  • Online: 2014-11-01 Published: 2014-11-04

Abstract: Building on existing distance metrics and the traditional spectral clustering algorithm, this paper proposes a new spectral clustering algorithm based on effective distance (SCED). SCED constructs the effective distance between samples from sparse reconstruction coefficients and uses it in place of the Euclidean distance of traditional spectral clustering to evaluate the similarity between samples. Compared with conventional distance metrics, the effective distance uses not only the distance between the two samples of a pair but also the distances between the target sample and all other related samples, so the metric has a global character. Experimental results on ten UCI benchmark datasets show that SCED effectively improves clustering performance.

Key words: spectral clustering, effective distance, distance metric
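
The abstract does not give the formula for the effective distance, so the following is only a minimal sketch of how sparse reconstruction coefficients might be turned into an affinity matrix for spectral clustering. It assumes the form d(i, j) = 1 - ln p(j | i), where p(j | i) is the normalized share of sample j in the reconstruction of sample i; that form, the symmetrization step, the Gaussian kernel, and the names `effective_distances` and `affinity` are all illustrative assumptions, not the authors' implementation. The coefficient matrix itself is taken as given (in practice it would come from a sparse coding step).

```python
import math


def effective_distances(coeffs, eps=1e-12):
    """Turn reconstruction coefficients into pairwise distances.

    coeffs[i][j] is the (hypothetical) weight of sample j in the sparse
    reconstruction of sample i. Assumed form: d = 1 - ln(p), with p the
    row-normalized absolute coefficient; eps avoids log(0) for zero weights.
    """
    n = len(coeffs)
    dist = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(coeffs):
        # Normalize over all other samples so each row acts like a distribution.
        total = sum(abs(w) for k, w in enumerate(row) if k != i)
        for j, w in enumerate(row):
            if i == j:
                continue
            p = abs(w) / total if total > 0 else 0.0
            dist[i][j] = 1.0 - math.log(max(p, eps))
    return dist


def affinity(dist, sigma=1.0):
    """Symmetrize the distances and apply a Gaussian kernel (an assumption)."""
    n = len(dist)
    aff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = 0.5 * (dist[i][j] + dist[j][i])
                aff[i][j] = math.exp(-d * d / (2.0 * sigma * sigma))
    return aff


# Toy coefficients: sample 0 is reconstructed mostly from sample 1.
coeffs = [
    [0.0, 0.9, 0.1],
    [0.8, 0.0, 0.2],
    [0.5, 0.5, 0.0],
]
dist = effective_distances(coeffs)
aff = affinity(dist)
```

The resulting `aff` matrix could then be handed to any spectral clustering routine that accepts a precomputed affinity. Because each row of coefficients is normalized over all other samples, the distance between one pair depends on how the target sample relates to the rest of the data, which is the global property the abstract describes.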