Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2016, Vol. 10 ›› Issue (4): 554-564. DOI: 10.3778/j.issn.1673-9418.1505041

• Artificial Intelligence and Pattern Recognition •

Maximum Entropy Clustering Algorithm for Multi-View Data (面向多视角数据的极大熵聚类算法)

ZHANG Dandan+, DENG Zhaohong, WANG Shitong (张丹丹+, 邓赵红, 王士同)

  1. School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2016-04-01 Published: 2016-04-01

Abstract: When applied to multi-view clustering tasks, maximum entropy clustering (MEC) currently merges the samples from all views into a single combined sample before processing. This destroys the independence of the individual views and degrades the final partition. To address this problem, this paper first proposes a multi-view collaborative partition MEC algorithm (CoMEC), which adds a constraint term that coordinates the partitions of the view spaces, so that the clustering of each view takes the influence of the other views into account. CoMEC is then extended to a view-weighted version, view weighted collaborative partition MEC (W-CoMEC), by distinguishing the importance of each view. Finally, a geometric-mean integration strategy is used to obtain the global partition. Experimental results on a synthetic multi-view dataset and several UCI real-world multi-view datasets show that the proposed algorithm outperforms, or is at least comparable to, existing clustering techniques on multi-view clustering tasks.

Key words: entropy, multi-view clustering, partition, weight, integration strategy, UCI dataset
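
The abstract does not state the CoMEC objective, so the following is only a minimal sketch of two ingredients it names: the standard single-view MEC updates and a geometric-mean integration of per-view memberships. It assumes the textbook MEC objective (squared Euclidean distance plus an entropy term scaled by `gamma`). It does not reproduce the paper's collaborative constraint term or its view-weight update. The function names `mec` and `geometric_mean_fusion`, the `init_idx` argument, and the `view_weights` argument are hypothetical. Sharing `init_idx` across views is a stand-in used here only to keep cluster labels aligned; it is not the paper's coordination mechanism.

```python
import numpy as np


def mec(X, init_idx, gamma=1.0, n_iter=100, tol=1e-6):
    """Single-view maximum entropy clustering (standard formulation).

    Minimizes sum_ij u_ij * ||x_j - v_i||^2 + gamma * sum_ij u_ij * ln(u_ij)
    subject to each sample's memberships summing to 1.
    init_idx: sample indices used as initial centers (hypothetical argument,
    used here only to keep cluster labels aligned across views).
    """
    X = np.asarray(X, dtype=float)
    centers = X[np.asarray(init_idx)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distances, shape (n_samples, n_clusters).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Membership update: softmax of -d2 / gamma (shifted for stability).
        logits = -d2 / gamma
        logits -= logits.max(axis=1, keepdims=True)
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)
        # Center update: membership-weighted mean of the samples.
        mass = np.maximum(U.sum(axis=0), 1e-12)
        new_centers = (U.T @ X) / mass[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return U, centers


def geometric_mean_fusion(memberships, view_weights=None):
    """Fuse per-view membership matrices with a (weighted) geometric mean."""
    n_views = len(memberships)
    if view_weights is None:
        view_weights = np.full(n_views, 1.0 / n_views)
    w = np.asarray(view_weights, dtype=float)
    w = w / w.sum()
    # Work in log space; clip to avoid log(0).
    log_u = sum(wk * np.log(np.clip(Uk, 1e-12, None))
                for wk, Uk in zip(w, memberships))
    fused = np.exp(log_u)
    # Renormalize so each sample's fused memberships sum to 1.
    return fused / fused.sum(axis=1, keepdims=True)


# Toy example: two views of the same 40 samples, two clusters.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
view_a = rng.normal(0.0, 0.5, size=(40, 2)) + 10.0 * labels[:, None]
view_b = rng.normal(0.0, 0.5, size=(40, 3)) - 8.0 * labels[:, None]

init_idx = [0, 20]  # one seed sample per cluster, shared by both views
U_a, _ = mec(view_a, init_idx)
U_b, _ = mec(view_b, init_idx)
U_global = geometric_mean_fusion([U_a, U_b])
hard_labels = U_global.argmax(axis=1)
```

Each view is clustered in its own feature space rather than being concatenated with the others, and only the membership matrices are combined. A geometric mean rewards clusters that every view supports: if any one view gives a cluster a membership near zero, the fused membership for that cluster is also near zero.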