Journal of Frontiers of Computer Science and Technology ›› 2021, Vol. 15 ›› Issue (2): 366-378.DOI: 10.3778/j.issn.1673-9418.2004040

• Theory and Algorithm • Previous Articles     Next Articles

Minimal Negative Co-location Patterns and Effective Mining Algorithm

WANG Guangyao, WANG Lizhen, YANG Peizhong, CHEN Hongmei   

  1. School of Information Science and Engineering, Yunnan University, Kunming 650504, China
  • Online:2021-02-01 Published:2021-02-01

极小负co-location模式及有效的挖掘算法

王光耀王丽珍杨培忠陈红梅   

  1. 云南大学 信息学院,昆明 650504

Abstract:

A spatial co-location pattern refers to a subset of spatial features whose instances are frequently co-located in spatial. In spatial data mining, most of the existing algorithms aim to discover the positive patterns. Moreover, there are patterns with strong negative correlations in spatial, such as negative co-location patterns. The discovery of such patterns is also greatly significant in some applications. Existing negative co-location patterns mining algori-thm is time-consuming and the number of mining results is huge. To address these problems, this paper explores the upward inclusion property of the negative co-location pattern, and proposes a minimal negative co-location pattern. Based on the minimal negative co-location patterns, all prevalent negative co-location patterns can be derived. In the negative co-location pattern mining process, the calculation of table instances of the candidate patterns is the funda-mental factor that restricts the mining efficiency. In order to reduce the calculation of the table instance effectively, three pruning strategies are proposed. A large number of experiments on real and synthetic data sets verify the correctness and efficiency of the proposed algorithm. In particular, the experimental results show that the minimal negative co-location patterns can compress the number of prevalent negative co-location patterns by more than 80%.

Key words: spatial data mining, spatial co-location pattern, minimal negative co-location pattern, upward inclusion, compact representation

摘要:

空间co-location(并置)模式是指实例在空间中频繁关联的一组空间特征的子集。在空间数据挖掘中,现有算法主要针对的是正模式的挖掘,而空间中还存在着具有强负相关性的模式,如负co-location模式,这类模式的挖掘在一些应用中同样具有重要的意义。现有的负co-location模式挖掘算法的时间复杂度较高,挖掘到的模式数量巨大。针对该问题,探索了负co-location模式的向上包含性质,提出了极小负co-location模式,证明了极小负co-location模式可推导出所有频繁负co-location模式。在负co-location模式挖掘中,计算模式的表实例是制约挖掘效率的根本因素,为此提出了3个剪枝策略有效地提高了算法的效率。在真实和合成数据集上的大量实验,验证了提出方法的正确性和高效性。特别地,大量实验结果表明极小负co-location模式可将频繁负co-location模式数量压缩80%以上。

关键词: 空间数据挖掘, 空间co-location模式, 极小负co-location模式, 向上包含, 紧凑表示