Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (5): 1053-1063. DOI: 10.3778/j.issn.1673-9418.2011003

• Database Technology •

Top-k Average Utility Co-location Pattern Mining of Fuzzy Features

LI Jinhong, WANG Lizhen, ZHOU Lihua

  1. School of Information Science and Engineering, Yunnan University, Kunming 650500, China
  • Received: 2020-11-02 Revised: 2021-01-05 Online: 2022-05-01 Published: 2022-05-19
  • Corresponding author: WANG Lizhen, E-mail: lzhwang@ynu.edu.cn
  • About author: LI Jinhong, born in 1994 in Qujing, Yunnan, M.S. candidate. Her research interest is spatial data mining.
    WANG Lizhen, born in 1962 in Wuxiang, Shanxi, Ph.D., professor, Ph.D. supervisor, senior member of CCF. Her research interests include spatial data mining, interactive data mining, and big data analytics and its applications.
    ZHOU Lihua, born in 1968 in Huaping, Yunnan, Ph.D., professor, Ph.D. supervisor, member of CCF. Her research interests include data mining, machine learning and social network analysis.
  • Supported by:
    National Natural Science Foundation of China (61966036); National Natural Science Foundation of China (61662086); Project of Innovative Research Team of Yunnan Province (2018HC019)

Abstract:

A spatial co-location pattern is a non-empty subset of spatial features whose instances are frequently located together in a spatial neighborhood. Top-k spatial co-location pattern mining has been studied for both deterministic and uncertain data, but top-k average utility co-location pattern mining for fuzzy features has not yet been addressed. This paper therefore proposes top-k average utility co-location pattern mining for fuzzy features. Firstly, the relevant concepts of top-k average utility co-location patterns of fuzzy features are defined, and the downward closure property of the extended fuzzy average utility of a pattern is analyzed. Secondly, an algorithm that mines top-k average utility co-location patterns based on the extended fuzzy average utility value is designed, which solves the problem that the fuzzy average utility itself does not satisfy the downward closure property. Thirdly, a pruning method based on the local extended fuzzy average utility is proposed, which effectively reduces the search space of top-k average utility co-location pattern mining and further improves the efficiency of the mining algorithm. Finally, the practicability, efficiency and robustness of the proposed algorithm are verified on real and synthetic datasets.

Key words: spatial co-location pattern, high average utility, fuzzy feature, top-k

CLC Number: