计算机科学与探索 ›› 2021, Vol. 15 ›› Issue (11): 2151-2160.DOI: 10.3778/j.issn.1673-9418.2008099

• 数据库技术 • 上一篇    下一篇

基于位置的稀疏群体查询

李娜,朱怀杰,刘威,印鉴   

  1. 1. 中山大学 数据科学与计算机学院,广州 510006
    2. 广东省大数据分析与处理重点实验室,广州 510006
  • 出版日期:2021-11-01 发布日期:2021-11-09

Geo-Socially Tenuous Group Query

LI Na, ZHU Huaijie, LIU Wei, YIN Jian   

  1. 1. School of Data and Computer Science, Sun Yat-Sen University, Guangzhou 510006, China
    2. Guangdong Key Laboratory of Big Data Analysis and Processing, Guangzhou 510006, China
  • Online:2021-11-01 Published:2021-11-09

摘要:

在基于位置的社交网络中,找到一个特殊的群体/社区是非常重要的。现在的研究都集中于寻找群体之间关系紧密的密集子图。相对于紧密的群体/子图,对于稀疏群的研究少之又少。虽然现有工作开始研究稀疏群体查询问题,但是还没有研究基于位置的稀疏群体查询问题,而基于位置的服务在现实生活中有很多需求。因此,研究基于位置的稀疏群查询的问题变得有研究价值。基于位置的稀疏群体查询是为了找到一群用户,不仅用户之间满足一定的稀疏性(即用户之间的社交距离大于[k]),且最小化用户到查询位置的距离和。针对这个问题,首先提出基于c-邻居的基本处理算法(简称baseline),其主要利用存储的c-邻居信息以及距离剪枝来帮助快速获得查询结果。但是baseline算法的空间消耗太大,且在稀疏阈值参数[k>c]时查询效率不高。为了解决这些问题,进一步提出基于c-邻居和反向c-邻居的查询优化算法(简称ICN),不仅利用存储的c-邻居且利用反向c-邻居信息来处理参数[k>c]的情况,从而快速获得查询结果。实验结果和理论表明,提出的两种查询处理方法是有效的和正确的。

关键词: 基于位置的社交网络图, 稀疏群, c-邻居, 距离剪枝

Abstract:

Compared with the dense group/subgraph, there are few studies on tenuous groups. Although the existing work has begun to study the tenuous population query, geo-socially tenuous group query has not been studied, and location-based services have a lot of demands in real life. Therefore, it becomes valuable to study the geo-socially tenuous group query. Geo-socially tenuous group query is to find a group of users, which not only satisfies a certain sparsity between users (i.e. the social distance between users is greater than [k]), but also minimizes the distance between users and the query location. To address this problem, this paper first proposes a basic processing algorithm based on c-neighbor (baseline), which uses stored c-neighbor information and distance pruning to help obtain query results quickly. However, the basic processing algorithm based on c-neighbor (baseline) uses too much space and the query efficiency is not high when parameter [k>c]. To solve these problems, a query optimization algorithm based on c-neighbor and reverse c-neighbor (ICN) is proposed, which not only utilizes stored c-neighbor information but also reverse c-neighbor information to effectively filter out invalid users and obtain query results quickly. The experimental results and theory show that the proposed two query processing methods are effective and correct.

Key words: location-based social network graph, tenuous group, c-neighbor, distance pruning