Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (7): 1114-1122. DOI: 10.3778/j.issn.1673-9418.1806023

• Academic Research •

Mixture Rank Matrix Factorization Model

LI Xingxing, LIU Huafeng, JING Liping+   

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • Online: 2019-07-01 Published: 2019-07-08

Abstract: With the development of recommendation systems, matrix approximation has become an active research topic, and low-rank matrix approximation models, represented by probabilistic matrix factorization, have attracted wide attention because of their high recommendation accuracy. However, in the era of big data, rating matrices are becoming increasingly complex, and a single simple matrix approximation model overlooks some of the information hidden in the data. To address this problem, a mixture rank matrix factorization (MRMF) algorithm based on the boosting framework is proposed. The algorithm fuses multiple matrices of different ranks within the boosting framework to capture rich rating information. Specifically, it first captures the global information of the rating matrix from its overall structure, and then computes the deviation in a boosting manner to obtain a residual matrix that captures local correlations. To learn local features better, sample weights that follow a Laplacian prior distribution are introduced, yielding an adaptive weight matrix factorization (AWMF) model. After the residual matrix is obtained, its weights are learned with the EM algorithm, which avoids overfitting and reduces the effort of manual tuning. Experimental results show that the proposed algorithm achieves good recommendation accuracy on four real-world datasets: Ciao, Epinions, Douban, and MovieLens (10M).

Key words: matrix approximation, gradient boosting, adaptive rank, sample weight, recommendation systems, weight matrix factorization
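
The abstract describes fitting a global model first and then fitting residuals with factorizations of different rank. As a rough illustration of that general idea only, the sketch below stacks plain matrix factorizations on successive residuals. It is not the authors' MRMF: ordinary SGD matrix factorization stands in for AWMF (there are no Laplacian-prior sample weights), every stage is given a fixed weight of 1 rather than a weight learned by EM, and the names `fit_mf`, `fit_stages`, `predict`, the rank schedule `(1, 2, 4)`, and all hyperparameters are hypothetical choices made for this example.

```python
import random


def dot(p, q):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(p, q))


def fit_mf(triples, n_users, n_items, rank, epochs=200, lr=0.02, reg=0.02, seed=0):
    """Fit one rank-`rank` factorization to (user, item, value) triples by SGD.

    Plain L2-regularized matrix factorization (an assumption of this sketch),
    not the adaptive-weight model described in the abstract.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    order = list(triples)
    for _ in range(epochs):
        rng.shuffle(order)
        for u, i, value in order:
            err = value - dot(U[u], V[i])
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V


def fit_stages(ratings, n_users, n_items, ranks=(1, 2, 4)):
    """Boosting-style loop: each stage fits what earlier stages left unexplained."""
    residuals = list(ratings)
    stages, rmse_history = [], []
    for stage, rank in enumerate(ranks):
        U, V = fit_mf(residuals, n_users, n_items, rank, seed=stage)
        stages.append((U, V))
        # Subtract this stage's fit; stage weight is fixed to 1 (assumption).
        residuals = [(u, i, r - dot(U[u], V[i])) for u, i, r in residuals]
        mse = sum(r * r for _, _, r in residuals) / len(residuals)
        rmse_history.append(mse ** 0.5)
    return stages, rmse_history


def predict(stages, u, i):
    """Predicted rating: the sum of every stage's contribution."""
    return sum(dot(U[u], V[i]) for U, V in stages)
```

Given a list of `(user, item, rating)` triples, `fit_stages(ratings, n_users, n_items)` returns the per-stage factors together with the training RMSE recorded after each stage, and `predict(stages, u, i)` sums the stage contributions for one user-item pair.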