计算机科学与探索 ›› 2011, Vol. 5 ›› Issue (8): 751-759.

• 学术研究 • 上一篇    下一篇

K-匿名隐私保护模型下的Top-k查询

辛婷婷, 刘国华   

  1. 东华大学 计算机科学与技术学院, 上海 201620
  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2011-08-01 发布日期:2011-08-01

Top-k Queries under K-Anonymity Privacy Protection Model

XIN Tingting, LIU Guohua   

  1. School of Computer Science and Technology, Donghua University, Shanghai 201620, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2011-08-01 Published:2011-08-01

摘要: 数据查询问题是K-匿名隐私保护模型下数据可用性问题之一。提出一种K-匿名数据的空间数据组织方法及其索引方法; 定义了两种新的查询UK-Rank和NT-Rank, UK-Rank主要应用于一些需要排序的查询, NT-Rank应用于点查询或者范围查询; 采用了Monte-Carlo积分近似计算的抽样方法来提高查询效率。对提出的相关算法进行了实验, 结果表明, 将K-匿名数据组织成空间数据的方法是可行的, 并且应用抽样方法后, 查询效率大大提高。

关键词: Top-k查询, K-匿名数据, 不确定数据库, 偏序, R-tree

Abstract: How to answer queries under the K-anonymity privacy protection model is one of problems for the availability of anonymized data. This paper proposes a translation method from K-anonymized data to spatial data and an indexing method. It defines two new queries for the availability of anonymized data, UK-Rank and NT-Rank. UK-Rank is mainly used in queries that require sorting, NT-Rank is used in the point query or range query. The Monte-Carlo integration is used to compute accurate estimate of probability and improves query efficiency. Finally, related experiments are conducted. The experimental results show that the translation from K-anonymized data to spatial data is feasible, and the query efficiency is greatly improved after the application of sampling methods.

Key words: Top-k query, K-anonymized data, uncertain database, partial orders, R-tree