Journal of Frontiers of Computer Science and Technology ›› 2011, Vol. 5 ›› Issue (09): 857-864.

• Academic Research •

SlopeOne Collaborative Filtering Recommendation Algorithm Based on Dynamic k-Nearest-Neighborhood

SUN Limei, LI Jingjiao, SUN Huanliang   

  1. Information and Control Engineering Faculty, Shenyang Jianzhu University, Shenyang 110168, China
  2. College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
  • Online: 2011-09-01 Published: 2011-09-01

Abstract: Collaborative filtering is a widely used technique in recommendation systems, and data sparsity is a main factor limiting its prediction accuracy. The SlopeOne algorithm uses a linear regression model to address the data sparsity problem. A k-nearest-neighborhood method based on user similarity can improve the quality of the user ratings that participate in prediction. Building on SlopeOne, this paper presents a new collaborative filtering algorithm that combines dynamic k-nearest-neighborhood selection with SlopeOne. First, a different number of neighbors is dynamically selected for each user according to that user's similarities with other users. Second, average deviations between pairs of relevant items are computed from the ratings of the neighbor users. Finally, the target ratings are predicted by the linear regression model. Experiments on the MovieLens dataset show that the proposed algorithm predicts more accurately than SlopeOne and is more robust to data sparsity. It also outperforms other collaborative filtering algorithms in prediction accuracy.

Key words: collaborative filtering, recommendation system, k-nearest-neighborhood, data mining, knowledge discovery
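
The three steps outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the similarity measure or the rule for choosing the number of neighbors, so the cosine similarity, the `threshold` and `max_k` parameters, and all function names below are assumptions made for illustration. The prediction step uses the basic (unweighted) Slope One scheme restricted to the selected neighbors.

```python
from collections import defaultdict
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts; unrated items count as 0 (assumed measure)."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_neighbors(ratings, user, threshold=0.7, max_k=20):
    """Step 1 (hypothetical rule): keep users whose similarity reaches the
    threshold, so the neighbor count differs from user to user, capped at max_k."""
    scored = [
        (cosine_similarity(ratings[user], other_ratings), other)
        for other, other_ratings in ratings.items()
        if other != user
    ]
    kept = sorted((s, u) for s, u in scored if s >= threshold)
    kept.reverse()  # most similar first
    return [u for _, u in kept[:max_k]]


def predict(ratings, user, target, threshold=0.7, max_k=20):
    """Steps 2 and 3: average item deviations over the neighbors only,
    then apply the basic Slope One predictor. Returns None if no data."""
    dev_sum = defaultdict(float)
    count = defaultdict(int)
    for neighbor in select_neighbors(ratings, user, threshold, max_k):
        r = ratings[neighbor]
        if target not in r:
            continue
        for item, value in r.items():
            # Only items the active user has rated can contribute.
            if item != target and item in ratings[user]:
                dev_sum[item] += r[target] - value
                count[item] += 1
    if not count:
        return None
    # Basic Slope One: mean over items of (average deviation + user's rating).
    estimates = [dev_sum[i] / count[i] + ratings[user][i] for i in count]
    return sum(estimates) / len(estimates)


# Toy data (invented for this sketch, not taken from MovieLens).
ratings = {
    "alice": {"A": 5, "B": 3},
    "bob": {"A": 4, "B": 2, "C": 3},
    "carol": {"A": 5, "B": 3, "C": 4},
    "dave": {"A": 1, "B": 5, "C": 5},
}
print(predict(ratings, "alice", "C"))
```

In this toy data, dave's tastes differ sharply from alice's, so with the assumed threshold he is excluded and the deviations for item C come from bob and carol only. Lowering `threshold` to 0 admits every user and reduces the sketch to plain Slope One.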
