Journal of Frontiers of Computer Science and Technology, 2017, Vol. 11, Issue (9): 1405-1417. DOI: 10.3778/j.issn.1673-9418.1609029


User Preferences Prediction Based on Multidimensional Features of Apps

CHEN Zhenpeng1,2, LU Xuan1,2, LI Huoran1,2, LIU Xuanzhe1,2+   

  1. Key Lab of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871, China
  2. Peking University Information Technology Institute (Tianjin Binhai), Tianjin 300450, China
  • Online: 2017-09-01 Published: 2017-09-06


Abstract: In recent years, the rapid development of smartphones has brought an explosion of mobile applications (Apps). App developers therefore have a strong interest in predicting, in advance, how users will receive their Apps. This paper uses the uninstallation/installation ratio as an implicit indicator of user preferences and the favorable rating as an explicit one. The user-activity data used in this research was collected from a popular App store in China and spans five months, from May to September 2014. From this data, 9795 Apps are selected, each with at least 50 active users. The paper defines 30 features across 7 dimensions that might be correlated with user preferences and extracts them for each of the 9795 Apps. It then builds Random Forest classifiers to distinguish Apps with high uninstallation/installation ratios from those with low ratios, and Apps with high favorable ratings from those with low ratings. Finally, it identifies which of the 30 features notably influence the uninstallation/installation ratio and the favorable rating.

Key words: Android Apps, App features, user preferences
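The implicit indicator described in the abstract can be illustrated with a minimal sketch. The abstract states only that Apps need at least 50 active users and that they are divided into high and low uninstallation/installation ratios; the record layout (`AppStats`), the sample numbers, and the median split used as the high/low threshold are assumptions made here for illustration, not the authors' actual procedure.

```python
from dataclasses import dataclass
from statistics import median

# From the abstract: each selected App has at least 50 active users.
MIN_ACTIVE_USERS = 50


@dataclass
class AppStats:
    # Hypothetical record layout; the field names are illustrative only.
    name: str
    active_users: int
    installs: int
    uninstalls: int


def uninstall_install_ratio(app: AppStats) -> float:
    """Implicit preference indicator: uninstallations divided by installations."""
    return app.uninstalls / app.installs


def label_by_ratio(apps):
    """Filter Apps by active users, then label each one 'high' or 'low'.

    Assumption: the split point is the median ratio of the kept Apps.
    The abstract does not say how the paper chooses its threshold.
    """
    kept = [a for a in apps
            if a.active_users >= MIN_ACTIVE_USERS and a.installs > 0]
    if not kept:
        return {}
    ratios = {a.name: uninstall_install_ratio(a) for a in kept}
    threshold = median(ratios.values())
    return {name: ("high" if r > threshold else "low")
            for name, r in ratios.items()}


# Made-up sample data, not taken from the paper's data set.
sample_apps = [
    AppStats("app_a", 120, 1000, 100),   # ratio 0.10
    AppStats("app_b", 80, 500, 300),     # ratio 0.60
    AppStats("app_c", 60, 200, 80),      # ratio 0.40
    AppStats("app_d", 10, 40, 30),       # dropped: fewer than 50 active users
    AppStats("app_e", 300, 2000, 1500),  # ratio 0.75
]

labels = label_by_ratio(sample_apps)
print(labels)
```

Labels produced this way could serve as the binary target for a classifier such as a Random Forest, with the 30 App features as inputs; the favorable rating could be binarized in the same manner.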

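Abstract (Chinese edition): In recent years, the rapid development of smartphones has been accompanied by rapid growth in the number of mobile applications. Mobile application developers therefore want to predict, in advance, users' preferences for the applications they develop. The ratio of the number of times an Android application is uninstalled to the number of times it is downloaded is taken as an implicit reflection of user preferences, and users' evaluation of the application (the favorable rating) as an explicit reflection. Based on large-scale real-world usage data from May to September 2014, provided by a well-known mobile application market in China, 9795 Android applications with no fewer than 50 active users are selected for analysis. Thirty features that may influence user preferences are defined along 7 dimensions and extracted for each application. Using these features, classifiers are trained with the random forest algorithm to separate applications by high or low uninstallation/download ratio or favorable rating, and the features that significantly influence the uninstallation/download ratio and the favorable rating are identified.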
