计算机科学与探索, 2019, Vol. 13, Issue (8): 1360-1369. DOI: 10.3778/j.issn.1673-9418.1812023

• 人工智能 •

PID参数调节的谱多流形聚类算法研究

罗养霞,马迪,常言说   

  1. 西安财经大学 信息学院,西安 710100
  2. 密西根大学 迪尔伯恩分校 计算机与信息科学系,美国 迪尔伯恩 48124
  3. 计算机应用与商务智能研究中心,西安 710100
  • 出版日期:2019-08-01 发布日期:2019-08-07

Spectral Multi-Manifold Clustering Based on PID Parameter Adjustment

LUO Yangxia, MA Di, CHANG Yanshuo   

  1. School of Information, Xi’an University of Finance and Economics, Xi’an 710100, China
  2. Department of Computer and Information Science, University of Michigan-Dearborn, Dearborn 48124, USA
  3. Research Center for Computer Application and Business Intelligence, Xi’an 710100, China
  • Online: 2019-08-01 Published: 2019-08-07

摘要: 数据的复杂和多样性使得对大数据处理和分析能力有更高的要求。流形聚类在数据挖掘中取得显著的成功,但参数调整是聚类算法研究中的难点之一,直接影响聚类性能。传统的聚类算法参数调节一般依赖于经验,或者因参数调节的盲目性和随机性,而使得算法失效或复杂度较高。提出了一种基于比例-积分-微分(PID)控制约束的主动控制机制,约束谱多流形聚类参数调整的新方法。通过构造相似度矩阵,使用多个主成分分析器来估计局部切线空间。模型逼近过程由参数传递和PID调节控制。在调整过程中,采用三维ZN方法调整模型参数,扩展搜索空间,根据反馈结果控制谱多流形聚类过程,提高了算法的准确性和复杂性。通过在合成和实际中的不同类型的数据特征集进行检验,可以获得更好的聚类性能。

关键词: 谱多流形聚类, 子空间聚类, 聚类参数调节, 比例-积分-微分(PID)

Abstract: The complexity and diversity of data place higher demands on big data processing and analysis. Manifold clustering has shown remarkable success on many data mining problems, but parameter adjustment remains one of the difficulties in clustering research and directly affects clustering performance. Traditional parameter adjustment for clustering algorithms generally relies on experience, or is carried out blindly and randomly, which can make the algorithm fail or raise its complexity. This paper proposes a new method that actively controls the parameter adjustment of spectral multi-manifold clustering through proportional-integral-derivative (PID) control constraints. When constructing the similarity matrix, multiple principal component analyzers are used to estimate the local tangent space. The model approximation process is controlled by parameter transfer and PID adjustment. During adjustment, a three-dimensional ZN method tunes the model parameters and extends the search space, and the clustering process is controlled according to the feedback results, improving the accuracy of the clustering algorithm and reducing its complexity. Tests on synthetic and real data sets with different types of features show that the method achieves better clustering performance.

Key words: spectral multi-manifold clustering, subspace clustering, clustering parameter adjustment, proportional-integral-derivative (PID)
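
The feedback idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the "three-dimensional ZN method", so the sketch assumes that ZN refers to Ziegler-Nichols and substitutes the classic Ziegler-Nichols PID rule. The names `ziegler_nichols_pid`, `PID`, `tune_parameter`, and `score_fn` are hypothetical, and `score_fn` stands in for whatever clustering-quality feedback the full paper uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


def ziegler_nichols_pid(ku: float, tu: float) -> Tuple[float, float, float]:
    """Classic Ziegler-Nichols PID gains (an assumption, not the paper's rule).

    ku is the ultimate gain and tu the oscillation period at that gain.
    """
    kp = 0.6 * ku
    ki = 1.2 * ku / tu      # kp / Ti with Ti = tu / 2
    kd = 0.075 * ku * tu    # kp * Td with Td = tu / 8
    return kp, ki, kd


@dataclass
class PID:
    """Discrete PID controller acting on an error signal."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float, dt: float = 1.0) -> float:
        # Accumulate the integral term, then form a backward-difference derivative.
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def tune_parameter(
    score_fn: Callable[[float], float],
    theta0: float,
    target: float,
    pid: PID,
    steps: int = 200,
) -> float:
    """Adjust one scalar parameter until score_fn(theta) approaches target.

    score_fn is a hypothetical stand-in for a clustering-quality measure;
    the controller output is applied as an increment to the parameter.
    """
    theta = theta0
    for _ in range(steps):
        error = target - score_fn(theta)
        theta += pid.step(error)
    return theta


if __name__ == "__main__":
    # Toy feedback signal: the score grows linearly with the parameter.
    controller = PID(kp=0.5, ki=0.1, kd=0.0)
    print(tune_parameter(lambda t: 0.5 * t, theta0=0.0, target=1.0, pid=controller))
```

In this toy setting the score is a simple monotone function of the parameter, so the loop settles where the score meets the target. A real clustering-quality signal is noisier and may not be monotone, which is presumably why the paper extends the search space during tuning.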