Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2018, Vol. 12 ›› Issue (8): 1238-1251. DOI: 10.3778/j.issn.1673-9418.1710030

• System Software and Software Engineering •

Privacy Rating for Mobile Apps Based on Crowdsourcing and Machine-Learning Techniques

ZHANG Xianxian (张贤贤)1, WANG Haoyu (王浩宇)1+, GUO Yao (郭耀)2, XU Guo'ai (徐国爱)3

    1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
    2. Key Laboratory of High-Confidence Software Technologies (Ministry of Education), School of Electronics Engineering and Computer Science, Software Institute, Peking University, Beijing 100871, China
    3. School of Cyber Space Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Online: 2018-08-01 Published: 2018-08-09

Abstract:

Permission abuse is widespread on mobile platforms: many apps collect and leak users' private information without their knowledge. Whether a use of private information is reasonable depends on the purpose behind it. To score the sensitive behavior of Android apps against user expectations, this paper proposes a privacy rating model based on the purpose for which an app uses its sensitive permissions. The model is trained with machine-learning techniques on crowdsourced data comprising 16,651 ratings, contributed by 421 users, of different <app, permission, purpose> triples. Static analysis then infers the purpose of each sensitive permission use in an app, and the privacy rating model scores the app accordingly. Experimental results show that the model achieves 80.7% accuracy. Applied to 11,931 apps from Google Play, the model indicates that around 8% of them pose serious privacy risks.

Key words: mobile privacy, permission, purpose, privacy score, machine learning
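
The rating idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the rating scale, the features, or the learning algorithm, so everything below is an assumption made for illustration: the ratings are invented, the -2 to +2 comfort scale is hypothetical, and a per-pair mean stands in for the trained machine-learning model. It is not the authors' implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical crowd ratings: (app, permission, purpose, comfort score).
# The -2..+2 scale and every value here are invented for illustration.
ratings = [
    ("MapApp", "LOCATION", "navigation", 2),
    ("MapApp", "LOCATION", "navigation", 1),
    ("GameApp", "LOCATION", "advertising", -2),
    ("GameApp", "LOCATION", "advertising", -1),
    ("GameApp", "CONTACTS", "advertising", -2),
]


def fit_mean_model(rows):
    """Average the crowd score for each (permission, purpose) pair.

    This simple baseline is a stand-in for the learned model; it is an
    assumption, not the technique used in the paper.
    """
    buckets = defaultdict(list)
    for _app, permission, purpose, score in rows:
        buckets[(permission, purpose)].append(score)
    return {pair: mean(scores) for pair, scores in buckets.items()}


def rate_app(model, uses, default=0.0):
    """Score an app as the mean predicted comfort over its permission uses.

    `uses` is the list of (permission, purpose) pairs that a static
    analysis step would report; here it is supplied by hand.
    """
    return mean(model.get(use, default) for use in uses)


model = fit_mean_model(ratings)
game_score = rate_app(model, [("LOCATION", "advertising"), ("CONTACTS", "advertising")])
map_score = rate_app(model, [("LOCATION", "navigation")])
```

In this toy data the same LOCATION permission receives a high score when used for navigation and a low one when used for advertising, which is the purpose-dependence the abstract describes.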