计算机科学与探索 (Journal of Frontiers of Computer Science and Technology) ›› 2018, Vol. 12 ›› Issue (11): 1748-1757. DOI: 10.3778/j.issn.1673-9418.1709051

• Database Technology •

差分隐私流数据实时发布方法

葛晨,吴英杰,孙岚   

  1. 福州大学 数学与计算机科学学院,福州 350116
  • 出版日期:2018-11-01 发布日期:2018-11-12

Real-Time Publishing Method of Differential Privacy Streaming Data

GE Chen, WU Yingjie, SUN Lan   

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
  • Online:2018-11-01 Published:2018-11-12

摘要:

许多流数据相关的实际应用需要进行大量的实时查询,现有的解决方案无法满足大量实时查询的效率要求。为此,提出一种差分隐私流数据实时发布方法。首先利用树状数组构建滑动窗口内流数据对应的统计发布模型,可在线性时间内实现滑动窗口下的连续统计发布,随后通过连续统计发布结果的线性组合即可在 O(1) 时间内获得用户需要的任意区间查询结果;其次,利用矩阵在处理关联性查询方面的优势,在查询效率量级不变的前提下利用对角矩阵优化进一步提高查询精度。实验对所提算法的查询效率和查询精度与同类算法进行比较分析,实验结果表明,该方法可显著提升查询效率并具有较优的查询精度。

关键词: 差分隐私, 流数据发布, 查询效率, 矩阵机制

Abstract:

Many practical applications of streaming data require a large number of real-time queries, and existing solutions cannot meet the efficiency requirements of such workloads. To this end, this paper proposes a real-time publishing method for differentially private streaming data. First, a binary indexed tree is used to build a statistical release model for the stream data in the sliding window. The model supports continuous statistical release under the sliding window in linear time, and any range query a user needs can then be answered in O(1) time as a linear combination of the continuously released statistics. Second, exploiting the strength of matrices in handling correlated queries, diagonal matrix optimization is applied to further improve query accuracy while keeping query efficiency at the same order of magnitude. The query efficiency and query accuracy of the proposed algorithm are compared with those of other sliding-window algorithms for publishing arbitrary range queries. The experimental results show that the method significantly improves query efficiency and achieves comparatively good query accuracy.

Key words: differential privacy, streaming data publication, query efficiency, matrix mechanism
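
The first step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single static window rather than a sliding one, assumes each stream item changes one count by at most 1, and omits the diagonal matrix optimization entirely. The function names (`build_fenwick`, `noisy_prefix_sums`, `range_query`) and the choice of Laplace noise calibrated to the tree depth are assumptions made for illustration. The sketch builds binary indexed tree nodes in linear time, perturbs each node once, derives all noisy prefix sums in linear time, and then answers any range query in O(1) as the difference of two prefix sums.

```python
import random


def laplace(scale, rng):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def build_fenwick(values):
    """Return the 1-indexed binary indexed tree node sums, built in O(n)."""
    n = len(values)
    tree = [0.0] + [float(v) for v in values]
    for i in range(1, n + 1):
        parent = i + (i & -i)  # next node whose range covers node i
        if parent <= n:
            tree[parent] += tree[i]
    return tree


def noisy_prefix_sums(values, epsilon, rng):
    """Return noisy prefix sums prefix[0..n] with prefix[0] == 0.

    Assumption: one item affects at most floor(log2 n) + 1 tree nodes,
    so Laplace noise of scale (floor(log2 n) + 1) / epsilon is added per node.
    """
    n = len(values)
    tree = build_fenwick(values)
    levels = max(n.bit_length(), 1)  # floor(log2 n) + 1 for n >= 1
    scale = levels / epsilon
    noisy = [0.0] + [tree[i] + laplace(scale, rng) for i in range(1, n + 1)]
    prefix = [0.0] * (n + 1)
    for k in range(1, n + 1):
        # Node k covers (k - lowbit(k), k]; reuse the shorter prefix.
        prefix[k] = noisy[k] + prefix[k - (k & -k)]
    return prefix


def range_query(prefix, i, j):
    """Noisy sum of positions i..j (1-indexed, inclusive) in O(1)."""
    return prefix[j] - prefix[i - 1]


if __name__ == "__main__":
    rng = random.Random(0)
    window = [3, 1, 4, 1, 5, 9, 2, 6]
    prefix = noisy_prefix_sums(window, epsilon=1.0, rng=rng)
    print(range_query(prefix, 2, 5))  # noisy estimate of 1 + 4 + 1 + 5
```

Because every noisy prefix sum is fixed once per window, repeated queries reuse the same released values and consume no additional privacy budget under the stated assumptions.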