Journal of Frontiers of Computer Science and Technology, 2012, Vol. 6, Issue 11: 985-993. DOI: 10.3778/j.issn.1673-9418.2012.11.003

• Academic Research •

Outlier Detection Algorithm on Shadowed Sets Clustering (融合Shadowed Sets聚类的离群点检测算法)

WANG Dan+, MAO Ziyang, WU Mengda

  1. Department of Mathematics and System Science, College of Science, National University of Defense Technology, Changsha 410073, China
  • Online: 2012-11-01 Published: 2012-11-02

Abstract: This paper proposes a new definition of outliers based on the overall, macroscopic characteristics of a data set, and defines a new outlier factor (COF) on the data's macroscopic pattern. The factor accounts for both how far a data point deviates from the cluster patterns and the uncertainty of the point's own cluster assignment. The paper also gives a new optimization objective for Shadowed Sets, so that the shadowing of a fuzzy set pays more attention to the accuracy of the core. Building on Shadowed Sets clustering, it then develops an outlier detection algorithm that performs clustering and outlier detection simultaneously. Experiments on synthetic data and the Iris data set show that the algorithm detects outliers well.

Key words: outlier, clustering, Shadowed Sets
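
The abstract does not state the new optimization objective or the formula of the outlier factor, so neither is reproduced here. As background only, the following minimal Python sketch illustrates the classical shadowed-set construction that this kind of work starts from: a threshold alpha splits fuzzy membership values into a core (elevated to 1), an exclusion region (reduced to 0), and a shadow of uncertain points, with alpha chosen to balance the membership mass moved against the size of the shadow. The function names, the grid search, and the sample values are illustrative assumptions, not the authors' method.

```python
def shadowed_partition(memberships, alpha):
    """Split fuzzy membership values into core, shadow, and exclusion.

    Classical thresholding for 0 < alpha < 0.5 (an assumption here,
    not the objective proposed in the paper).
    """
    core = [u for u in memberships if u >= 1 - alpha]        # elevated to 1
    exclusion = [u for u in memberships if u <= alpha]       # reduced to 0
    shadow = [u for u in memberships if alpha < u < 1 - alpha]  # uncertain
    return core, shadow, exclusion


def balance_objective(memberships, alpha):
    """|reduced mass + elevated mass - shadow cardinality| for a given alpha."""
    core, shadow, exclusion = shadowed_partition(memberships, alpha)
    reduced = sum(exclusion)                 # membership removed from low values
    elevated = sum(1 - u for u in core)      # membership added to high values
    return abs(reduced + elevated - len(shadow))


def best_alpha(memberships, steps=49):
    """Grid search over alpha in (0, 0.5); the grid itself is hypothetical."""
    candidates = [0.5 * (i + 1) / (steps + 1) for i in range(steps)]
    return min(candidates, key=lambda a: balance_objective(memberships, a))


if __name__ == "__main__":
    # Made-up membership values of seven points in one fuzzy cluster.
    values = [0.05, 0.1, 0.45, 0.5, 0.55, 0.9, 0.95]
    alpha = best_alpha(values)
    print(alpha, shadowed_partition(values, alpha))
```

Points that land in the shadow are those whose cluster assignment is uncertain, which is the kind of uncertainty the abstract says the outlier factor takes into account alongside deviation from the cluster patterns.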