Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2022, Vol. 16 ›› Issue (9): 2089-2095. DOI: 10.3778/j.issn.1673-9418.2012071

• Artificial Intelligence •

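Improved Incremental Dynamic and Static Combined Collaborative Filtering Method

WU Mei1, DING Yitong2, ZHAO Jianli1,+

  1. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, China
  2. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
  • Received: 2020-12-07  Revised: 2021-01-29  Publication date: 2022-09-01  Release date: 2021-02-04
  • Corresponding author: + E-mail: jlzhao@sdust.edu.cn
  • About the authors: WU Mei (born 1996), female, from Linyi, Shandong, M.S. candidate. Her research interest is personalized recommendation.
    DING Yitong (born 2000), female, from Qingdao, Shandong. Her research interest is information systems.
    ZHAO Jianli (born 1977), male, Ph.D., professor, Ph.D. supervisor. His research interests include pervasive computing, personalized recommendation, and indoor positioning.
  • Supported by:
    National Key Research and Development Program of China (2018YFC0831002); Humanities and Social Sciences Foundation of the Ministry of Education of China (18YJAZH136); National Natural Science Foundation of China (61433012); National Natural Science Foundation of China (U1435215)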

Abstract:

Matrix factorization algorithms are widely used in recommender systems because of their high prediction accuracy and good scalability. However, most current matrix factorization algorithms handle only static data. As training data gradually accumulate, traditional matrix factorization methods must retrain on all existing data to update the model, which greatly increases time and computation costs. The main research problem is therefore how to predict item ratings in a short time so as to make reasonable and accurate recommendations. To address this problem, an improved incremental matrix factorization algorithm is proposed. The main idea is to process the data in separate regions according to the source of the ratings during prediction. This region-based processing effectively shortens computation time while keeping accuracy within a certain range. In the static training module, the initial training of user and item features does not occupy online training time and can achieve good accuracy when the initial data volume is large. In the dynamic training module, the ratings associated with newly arriving users and items are extracted and trained to obtain a small dynamic matrix, which is then dynamically maintained and updated; subsequent feature training is performed on this small matrix. In addition, to reduce the training time of the dynamic matrix while preserving training accuracy, a fast update strategy based on stochastic gradient descent is adopted, which effectively shortens training time and improves accuracy to some extent. Experimental results on two public datasets demonstrate the superiority of the proposed algorithm.

Key words: recommender systems, collaborative filtering, matrix factorization, incremental model, cold start
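
The static/dynamic split described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only shows the general idea of fitting factors offline with plain regularized SGD and then training online on a small, separately maintained set of new ratings. The function names (`train_static`, `update_dynamic`, `sgd_step`), the hyperparameters, and the choice to update only the vectors touched by the dynamic ratings are all assumptions made for illustration.

```python
import random

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def rmse(P, Q, ratings):
    se = sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

def sgd_step(P, Q, u, i, r, lr=0.02, reg=0.02):
    """One regularized SGD update on a single (user, item, rating) triple."""
    p, q = P[u], Q[i]
    err = r - dot(p, q)
    for f in range(len(p)):
        pf, qf = p[f], q[f]  # keep old values so both updates use them
        p[f] += lr * (err * qf - reg * pf)
        q[f] += lr * (err * pf - reg * qf)

def ensure_vectors(P, Q, ratings, k, rng):
    """Create a random factor vector for every user/item not seen before."""
    for u, i, _ in ratings:
        if u not in P:
            P[u] = [rng.uniform(0.1, 1.0) for _ in range(k)]
        if i not in Q:
            Q[i] = [rng.uniform(0.1, 1.0) for _ in range(k)]

def train_static(ratings, k=4, epochs=200, seed=0):
    """Offline phase (assumed): fit factors on the initial ratings."""
    rng = random.Random(seed)
    P, Q = {}, {}
    ensure_vectors(P, Q, ratings, k, rng)
    for _ in range(epochs):
        for u, i, r in ratings:
            sgd_step(P, Q, u, i, r)
    return P, Q

def update_dynamic(P, Q, dynamic, new_ratings, k=4, epochs=100, seed=1):
    """Online phase (assumed): grow the small dynamic set and train on it only.

    Vectors of users/items that do not appear in `dynamic` are never touched,
    so the cost depends on the size of the dynamic set, not the full history.
    """
    rng = random.Random(seed)
    ensure_vectors(P, Q, new_ratings, k, rng)
    dynamic.extend(new_ratings)
    for _ in range(epochs):
        for u, i, r in dynamic:
            sgd_step(P, Q, u, i, r)
```

In this sketch, calling `train_static` once on the historical ratings and then `update_dynamic` each time a batch of ratings from new users or items arrives avoids retraining on the full history; how the paper actually partitions ratings by source and schedules its fast update is not specified in the abstract and is not reproduced here.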
