Journal of Frontiers of Computer Science and Technology, 2021, Vol. 15, Issue (9): 1717-1727. DOI: 10.3778/j.issn.1673-9418.2006068

• Artificial Intelligence •

Deep Clustering Algorithm Based on Denoising and Self-Attention

CHEN Junfen, ZHANG Ming, ZHAO Jiacheng, XIE Bojun, LI Yan   

  1. Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding, Hebei 071002, China
    2. School of Applied Mathematics, Beijing Normal University Zhuhai, Zhuhai, Guangdong 519087, China
  • Online: 2021-09-01 Published: 2021-09-06

Abstract:

Deep clustering methods, which jointly learn feature representations and cluster assignments, have recently achieved excellent clustering performance. However, their performance degrades substantially as image quality decreases, for example on noisy images. To address this, a novel deep clustering method, DDC (deep denoising clustering), is proposed. A deep convolutional denoising auto-encoder learns robust feature representations from noisy images, and a self-attention mechanism improves the network's ability to capture key local features. End-to-end joint training yields features better suited to clustering and then completes the cluster assignment. The similarities between feature embeddings and cluster centers are weighted by different coefficients to enlarge the difference between intra-cluster and inter-cluster similarities. Experimental results on public image datasets show that DDC achieves better clustering performance than other deep clustering algorithms; for example, on the COIL-20 dataset the clustering accuracy of DDC is 0.803, whereas that of DEC (deep embedded clustering) is 0.597. Overall, by combining a deep convolutional denoising auto-encoder with self-attention, DDC clusters noisy images more effectively and extends the range of applications of deep clustering.
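
To make the assignment step concrete, the following is a minimal NumPy sketch of a DEC-style soft assignment, in which the similarity between each embedding and each cluster center is computed with a Student's t kernel and then normalized per sample. It is an illustration under stated assumptions, not the DDC implementation: the abstract does not give the form or values of DDC's weighting coefficients, so the optional `weights` argument is a hypothetical stand-in, and the function names, toy data, and `alpha` default are likewise chosen here for illustration.

```python
import numpy as np


def soft_assign(z, centers, weights=None, alpha=1.0):
    """DEC-style soft assignment of embeddings to cluster centers.

    z:       (n, d) feature embeddings.
    centers: (k, d) cluster centers.
    weights: optional (k,) or (n, k) coefficients applied to the
             similarities. HYPOTHETICAL: the abstract only says the
             similarities are "weighted by different coefficients".
    """
    # Squared Euclidean distance between every embedding and every center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Student's t kernel: larger value means more similar.
    sim = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    if weights is not None:
        sim = sim * weights
    # Normalize each row into a probability distribution over clusters.
    return sim / sim.sum(axis=1, keepdims=True)


def target_distribution(q):
    """DEC auxiliary target: square q, divide by the soft cluster
    frequencies, and renormalize each row."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """Clustering loss KL(P || Q), summed over samples and clusters."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


# Toy data (assumed for illustration): two well-separated groups in 2-D.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
z = np.vstack([
    centers[0] + 0.3 * rng.standard_normal((4, 2)),
    centers[1] + 0.3 * rng.standard_normal((4, 2)),
])

q = soft_assign(z, centers)        # soft cluster assignments
p = target_distribution(q)         # target distribution derived from q
labels = q.argmax(axis=1)          # hard cluster labels
loss = kl_divergence(p, q)         # quantity minimized during training
```

In an end-to-end setting, a loss of this kind would typically be minimized together with the auto-encoder's reconstruction loss; how DDC balances the two terms is not stated in the abstract.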

Key words: deep clustering, feature representation, denoising convolutional auto-encoder, self-attention mechanism, clustering accuracy
