计算机科学与探索 (Journal of Frontiers of Computer Science and Technology), 2011, Vol. 5, Issue 10: 953-958.

• Academic Research •

Density Grid-Based Data Stream Clustering Algorithm with Parameter Automatization

XING Changzheng, WANG Fei, WANG Lili

  1. School of Electronics and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China
  2. School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou, Liaoning 121001, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-10-01 Published: 2011-10-01

Abstract: Traditional density grid-based clustering algorithms cannot automatically obtain a sufficiently accurate density threshold. To address this problem, this paper proposes A-Stream, a density grid-based data stream clustering algorithm with adaptive parameters. A-Stream introduces "double density thresholds" and uses the average value as the density threshold, which improves on the traditional clustering algorithm and resolves its inability to obtain an accurate value. Experimental results show that A-Stream retains the high efficiency of traditional density grid-based algorithms while considerably improving clustering accuracy.

Key words: clustering, data stream, grid, parameter adaptation, density threshold
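
The general idea of density grid-based clustering with an averaged threshold can be sketched as follows. This is only an illustrative sketch, not the A-Stream algorithm itself: the abstract does not say how the two density thresholds are derived, so the two candidates used here (the mean and the midrange of the cell densities) are assumptions, as are all function names. The sketch also works on a static snapshot of points and omits stream-specific handling.

```python
import math
from collections import Counter, deque
from statistics import mean


def cell_of(point, cell_size):
    """Map a point to the integer index of the grid cell containing it."""
    return tuple(math.floor(x / cell_size) for x in point)


def grid_densities(points, cell_size):
    """Count the points falling in each non-empty grid cell."""
    return Counter(cell_of(p, cell_size) for p in points)


def averaged_threshold(densities):
    """Average two candidate density thresholds.

    ASSUMPTION: both candidates are placeholders chosen for illustration;
    the source does not specify how A-Stream computes its two thresholds.
    """
    counts = list(densities.values())
    t_mean = mean(counts)                          # assumed candidate 1: mean cell density
    t_midrange = (min(counts) + max(counts)) / 2   # assumed candidate 2: midrange density
    return (t_mean + t_midrange) / 2


def face_neighbours(cell):
    """Yield the cells that differ from `cell` by 1 along exactly one axis."""
    for axis in range(len(cell)):
        for step in (-1, 1):
            yield cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]


def cluster_dense_cells(densities, threshold):
    """Label connected groups of dense cells using breadth-first search."""
    dense = {c for c, n in densities.items() if n >= threshold}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for nb in face_neighbours(cell):
                if nb in dense and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return labels
```

Cells whose density falls below the threshold receive no label and can be treated as noise. Adjacent dense cells are merged into one cluster, so the threshold alone determines which regions count as clusters, which is why its accuracy matters for clustering quality.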