Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2008, Vol. 2 ›› Issue (3): 330-336.

• Academic Research •

Mining frequent itemsets over data stream by matrix (数据流中基于矩阵的频繁项集挖掘)

WANG Lei+ (王磊), HUANG Zhiqiu (黄志球), ZHU Xiaodong (朱小栋), SHEN Guohua (沈国华), CHENG Liang (程亮)

  1. College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2008-06-20 Published: 2008-06-20
  • Contact: WANG Lei

Abstract: Mining frequent itemsets is a fundamental task in data stream mining. Many approximate algorithms can mine frequent itemsets over data streams effectively, but they do not effectively control memory consumption or mining run time. To improve the time and space efficiency of frequent itemset mining over data streams, a matrix is introduced as the synopsis data structure, and a new frequent itemset mining algorithm built on it is proposed. Experimental results demonstrate the effectiveness of the algorithm.

Key words: data stream, data mining, frequent itemsets, matrix