计算机科学与探索 ›› 2010, Vol. 4 ›› Issue (10): 909-917.DOI: 10.3778/j.issn.1673-9418.2010.10.005

• 学术研究 • 上一篇    下一篇

面向事件流的频繁片断计数算法*

黄 鹏+, 王 鹏, 汪 卫

  

  1. 复旦大学 计算机科学技术学院, 上海 200433
  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2010-10-01 发布日期:2010-10-01
  • 通讯作者: 黄 鹏

Fast Algorithm for Frequent Episodes Counting in Event Stream*

HUANG Peng+, WANG Peng, WANG Wei   

  1. School of Computer Science and Technology, Fudan University, Shanghai 200433, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2010-10-01 Published:2010-10-01
  • Contact: HUANG Peng

摘要: 在事件流上挖掘频繁片断已经成为近来研究的热点, 在很多应用中起到重要作用。以往的研究提出了一些挖掘算法, 包括基于滑动窗口和基于非重叠出现的方法。然而, 这些算法在处理基于片断互异出现的支持度计数时, 效率很低甚至无效。为此, 提出了一种包含状态计数的有限状态自动机模型, 并使用该模型给出了一种高效挖掘算法。从理论上对算法的效率和有效性进行了分析; 实验结果证明了算法是有效且高效的。

关键词: 事件流, 频繁片断挖掘, 互异出现计数, 数据挖掘

Abstract: With occurrences of many related applications, recently frequent episodes mining in event stream has become a hot research topic in data mining field. There exists some algorithms for frequent episodes mining such as window-based and non-overlapped occurrences based methods. However, these algorithms are not efficient, or even incapable, when dealing with the distinct occurrences based frequency counting of an episode which is essentially more effective. To solve the problem, this paper introduces a finite state automaton with state counting, and conse-quently proposes an efficient algorithm based on this model. It also gives theoretical analysis about the cost and effectiveness of this algorithm. Experimental results also verify the efficiency and effectiveness of the proposed algorithm.

Key words: event stream, frequent episodes discovery, distinct occurrences, data mining

中图分类号: