Journal of Frontiers of Computer Science and Technology, 2010, Vol. 4, Issue 7: 662-672. DOI: 10.3778/j.issn.1673-9418.2010.07.010

• Academic Research •

Self-Adaptive Fractal Technique on Detecting Cluster Evolution*

YAN Guanghui 1+, DONG Xiaohui 1, LIU Yun 1, HE Shaoling 1, MA Zhicheng 2

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
  2. Gansu Electric Power Information and Communication Centre, Lanzhou 730050, China
  • Online: 2010-07-14 Published: 2010-07-14
  • Contact: YAN Guanghui

Abstract: Data streams often change over time, and these changes tend to be bursty and random, so it is useful to understand, visualize and diagnose how the underlying trends evolve. Tracking such changes adaptively and in real time is an important problem in data stream mining. When streams are fast and continuous, the trends must be analyzed quickly and online; relying on users to detect changes by trial and error is impractical and defeats the purpose of tracking cluster evolution. This paper proposes a real-time algorithm for tracking cluster evolution in data streams. The algorithm combines fractal clustering, self-adaptive sampling and the Chernoff inequality, takes into account limited computing resources and processing-speed requirements, and detects cluster evolution automatically and in a timely manner. Experiments on synthetic and real data sets demonstrate the effectiveness and efficiency of the approach.

Key words: data mining, cluster evolution, fractal, self-adaptive sampling
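
The abstract names fractal clustering as one building block but does not describe how a fractal dimension is computed. As a minimal sketch, assuming a standard box-counting estimate (a generic technique, not necessarily the estimator used in the paper), the dimension of a point set can be approximated by the slope of log N(r) against log(1/r), where N(r) is the number of grid cells of side r that contain at least one point. The function names `box_count` and `box_counting_dimension`, the cell sizes and the two example point sets are illustrative assumptions.

```python
import math
from typing import Iterable, List, Sequence


def box_count(points: Iterable[Sequence[float]], r: float) -> int:
    """Count the grid cells of side r that contain at least one point."""
    return len({tuple(math.floor(x / r) for x in p) for p in points})


def box_counting_dimension(points: Iterable[Sequence[float]],
                           radii: List[float]) -> float:
    """Least-squares slope of log N(r) against log(1/r) over the given cell sizes."""
    pts = list(points)
    xs = [math.log(1.0 / r) for r in radii]
    ys = [math.log(box_count(pts, r)) for r in radii]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Illustrative data (assumed): a diagonal line and a filled unit square in 2-D.
radii = [2.0 ** -k for k in range(1, 6)]          # cell sides 1/2 ... 1/32
line = [(i / 4096, i / 4096) for i in range(4096)]
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]

print(box_counting_dimension(line, radii))    # close to 1 for the line
print(box_counting_dimension(square, radii))  # close to 2 for the filled square
```

In a streaming setting, a shift in this estimate as new points join a cluster is one plausible signal that the cluster is evolving. How the paper's algorithm uses such a signal, and how adaptive sampling and the Chernoff inequality enter, is not specified in the abstract.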
