Journal of Frontiers of Computer Science and Technology, 2021, Vol. 15, Issue (3): 389-402. DOI: 10.3778/j.issn.1673-9418.2009071

• Surveys and Frontiers •

Overview of Privacy Protection Technology of Big Data in Healthcare

GUO Zijing, LUO Yuchuan, CAI Zhiping, ZHENG Tengfei   

  1. College of Computer, National University of Defense Technology, Changsha 410073, China
  • Online: 2021-03-01  Published: 2021-03-05

医疗健康大数据隐私保护综述

郭子菁, 罗玉川, 蔡志平, 郑腾飞

  1. 国防科技大学 计算机学院,长沙 410073

Abstract:

With the spread of smart mobile devices, the digitalization of medical devices, and the structuring of electronic medical records, medical data are growing explosively. In-depth study of how big data in healthcare develops, and a better understanding of its real value, have attracted wide attention; how to protect privacy effectively in the process deserves equal attention. The characteristics of big data in healthcare and of its storage environment pose severe challenges to privacy protection. This paper first introduces the concepts and characteristics of big data in healthcare. Then, following the four stages of its life cycle model (data collection, storage, sharing, and analysis), it describes the risks and challenges faced at each stage and the corresponding privacy protection technologies, and analyzes their merits, drawbacks, and applicable scope. In the collection stage, anonymization and differential privacy can resist the background-knowledge attacks made possible by data integration and fusion. In the storage stage, big data in healthcare are mostly stored on cloud platforms, where encryption and auditing are commonly used to ensure data confidentiality and integrity. In the sharing stage, access control plays an important part in controlling who can obtain the data. In the analysis stage, privacy protection is achieved within machine learning frameworks. Finally, for the general privacy protection challenges that run through the whole life cycle of big data in healthcare, reasonable suggestions are proposed at the management level.

Key words: big data in healthcare, life cycle, privacy protection technology
