One Class Collaborative Filtering Recommendation Algorithm Coupled with User Common Characteristics

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (3): 637-648. DOI: 10.3778/j.issn.1673-9418.2009011

ZHANG Quangui⁺, HU Jiayan, WANG Li

School of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China

Received: 2020-09-07  Revised: 2020-11-06  Publication date: 2022-03-01  Release date: 2020-11-19
Corresponding author: ⁺ E-mail: zhqgui@126.com
About the authors: ZHANG Quangui, born in 1978 in Qinhuangdao, Hebei, Ph.D., associate professor, member of CCF. His research interests include deep learning and recommender systems.
HU Jiayan, born in 1993 in Jiangmen, Guangdong, M.S. candidate. Her research interests include deep learning and recommender systems.
WANG Li, born in 1994 in Fuxin, Liaoning, M.S. candidate. Her research interests include deep learning and recommender systems.
Supported by:
Guidance Program of the Natural Science Foundation of Liaoning Province (20180550995); Science and Technology Project of the Liaoning Provincial Department of Education (LJ2019JL009)

Abstract:

Combining explicit features with implicit feedback is a common way to improve the recommendation accuracy of one-class collaborative filtering (OCCF). However, existing studies generally integrate raw explicit features or cross features directly into OCCF models. Because it is hard to determine which explicit features really matter, such models rarely achieve significant performance gains. To address this, a one-class collaborative filtering recommendation algorithm coupled with user common characteristics (UCC-OCCF) is proposed. First, a neighbor-based common preference representation network (NB-CPR) is built to learn the interactions between a certain type of item and the neighbor users whose explicit features are similar to those of the current user, so that explicit features are used indirectly to obtain common preferences. Then, a personal deep latent factor representation network (DLFR) uses a deep neural network to learn the latent factors between the user and the item, yielding the interaction probability between the current user and the item. Finally, NB-CPR and DLFR are trained jointly, which couples the common characteristics of users into the OCCF model and improves recommendation accuracy. Experimental results on the public datasets MovieLens 100K, MovieLens 1M and MyAnimeList show that UCC-OCCF can significantly improve the recommendation accuracy of OCCF.

Key words: one-class collaborative filtering (OCCF), deep learning, common preferences, implicit feedback, explicit features

CLC Number: TP18

ZHANG Quangui, HU Jiayan, WANG Li. One Class Collaborative Filtering Recommendation Algorithm Coupled with User Common Characteristics[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(3): 637-648.

Figures/Tables 17

Fig.1 User-item interaction matrix with implicit (one-class) feedback

Table 1 Mathematical notations

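Symbol	Description
$U$	Set of users
$V$	Set of items
$u$	An instance of the user set $U$
$v$	An instance of the item set $V$
$p$	Latent vector of user $u$
$q$	Latent vector of item $v$
$p_i$	The $i$-th element of vector $p$
$q_j$	The $j$-th element of vector $q$
$W$	Weight matrix of the fully connected layer
$W_{NB_i}$	Weight of the $i$-th neighbor
$NB_i$	The $i$-th neighbor of the current user $u$
$y^{(i)}$	Label of the $i$-th sample in the training/test dataset
$\hat{y}^{(i)}$	Predicted value of the $i$-th sample in the test dataset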

Fig.2 UCC-OCCF framework

Fig.3 Personal deep latent factor representation network (DLFR)

Fig.4 User similarity matrix and neighbors
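
The idea behind a user similarity matrix and neighbor selection can be sketched as follows. This is a minimal illustration, not the authors' implementation: the one-hot feature encoding, the function names and the value of `k` are all assumptions, and cosine similarity is used only as one plausible measure (the suffixes COS, HAM and Pear in Table 5 presumably denote cosine, Hamming and Pearson similarity, but that reading is also an assumption).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(features):
    """Build a symmetric user-user similarity matrix from per-user feature vectors."""
    n = len(features)
    return [[cosine_similarity(features[i], features[j]) for j in range(n)]
            for i in range(n)]


def top_k_neighbors(sim, user, k):
    """Return the indices of the k users most similar to `user`, excluding the user itself."""
    candidates = [j for j in range(len(sim)) if j != user]
    candidates.sort(key=lambda j: sim[user][j], reverse=True)
    return candidates[:k]


# Hypothetical one-hot explicit features:
# [male, female, age<30, age>=30, student, engineer]
user_features = [
    [1, 0, 1, 0, 1, 0],  # user 0
    [1, 0, 1, 0, 0, 1],  # user 1
    [0, 1, 0, 1, 0, 1],  # user 2
    [1, 0, 1, 0, 1, 0],  # user 3 (same profile as user 0)
]

sim = similarity_matrix(user_features)
neighbors_of_0 = top_k_neighbors(sim, user=0, k=2)
print(neighbors_of_0)  # [3, 1]
```

User 3 shares every feature with user 0 and user 1 shares two of three, so they are chosen as the two nearest neighbors, while user 2 has no feature in common.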

Table 2 Statistics of datasets

Dataset	Users	Items	Interactions
MovieLens 100K	943	1 682	100 000
MovieLens 1M	6 040	3 592	1 000 209
MyAnimeList	9 130	14 478	2 945 242

Table 3 Explicit features of users and items

Category	Observable features
User	Gender, age, occupation, zip code
Item	Title, genre

Table 4 Test hyper-parameters

Hyper-parameter	Candidate values
Embedding dimension	{8, 16, 32, 64, 128, 256}
Last hidden layer dimension	{8, 16, 32, 64, 128, 256}
Learning rate	{0.0001, 0.0005, 0.0010, 0.0030, 0.0050, 0.0100}
Batch size	{128, 256, 512, 1024}
Number of user neighbors	{10, 20, 30, 40, 50, 100}
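
To give a sense of the size of this search space, the sketch below enumerates every combination of the candidate values in Table 4. It is an illustration only: the dictionary keys and the `iter_configs` helper are hypothetical names, and this page does not state whether the authors searched the full grid or tuned one hyper-parameter at a time.

```python
from itertools import product

# Candidate values copied from Table 4; the key names are illustrative.
search_space = {
    "embedding_dim": [8, 16, 32, 64, 128, 256],
    "last_hidden_dim": [8, 16, 32, 64, 128, 256],
    "learning_rate": [0.0001, 0.0005, 0.001, 0.003, 0.005, 0.01],
    "batch_size": [128, 256, 512, 1024],
    "num_neighbors": [10, 20, 30, 40, 50, 100],
}


def iter_configs(space):
    """Yield one dict per combination of candidate values (a full Cartesian grid)."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(search_space))
print(len(configs))  # 5184 = 6 * 6 * 6 * 4 * 6
print(configs[0])
```

A full grid of 5184 configurations would be expensive to train exhaustively, which is why such ranges are often explored one parameter at a time or by random sampling.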

Fig.5 Evaluation of HR@K

Fig.6 Evaluation of NDCG@K

Fig.7 Evaluation of MRR
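
For readers unfamiliar with the three metrics, the sketch below computes HR@K, NDCG@K and MRR in their usual form. It assumes a leave-one-out protocol in which each test case has exactly one held-out target item ranked among candidates; this page does not describe the paper's exact protocol, so both that assumption and the function names are illustrative.

```python
import math


def hit_ratio_at_k(ranked_items, target, k):
    """1.0 if the held-out target appears in the top-k of the ranked list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG@K with a single relevant item: 1 / log2(rank + 1) if it is in the top-k."""
    top_k = ranked_items[:k]
    if target not in top_k:
        return 0.0
    rank = top_k.index(target) + 1  # 1-based position
    return 1.0 / math.log2(rank + 1)  # the ideal DCG is 1 for one relevant item


def reciprocal_rank(ranked_items, target):
    """1 / rank of the target over the whole ranked list, or 0.0 if it is absent."""
    if target not in ranked_items:
        return 0.0
    return 1.0 / (ranked_items.index(target) + 1)


def evaluate(cases, k):
    """Average HR@K, NDCG@K and MRR over (ranked_items, target) test cases."""
    n = len(cases)
    hr = sum(hit_ratio_at_k(r, t, k) for r, t in cases) / n
    ndcg = sum(ndcg_at_k(r, t, k) for r, t in cases) / n
    mrr = sum(reciprocal_rank(r, t) for r, t in cases) / n
    return hr, ndcg, mrr


# Two hypothetical test users: the target is ranked 2nd for one and 4th for the other.
cases = [
    (["a", "b", "c", "d"], "b"),
    (["x", "y", "z", "w"], "w"),
]
hr, ndcg, mrr = evaluate(cases, k=3)
print(hr, mrr)  # 0.5 0.375
```

HR@K only records whether the target made the top-K list, NDCG@K additionally rewards a higher position within that list, and MRR scores the position without any cutoff.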

Fig.8 Evaluation of HR@10 using different numbers of neighbors

Fig.9 Evaluation of NDCG@10 using different numbers of neighbors

Fig.10 Evaluation of MRR using different numbers of neighbors

Fig.11 Evaluation of model structure

Table 5 Time for training and prediction

Dataset	Time/s	itemKNN	Wide&Deep	NeuMF	NGCF	DeepICF	UCC-OCCF-COS	UCC-OCCF-HAM	UCC-OCCF-Pear
MovieLens 100K	Training (per epoch)	12.3	24.5	13.1	14.4	20.3	25.7	20.6	28.6
	Total training	293.4	575.8	510.6	432.6	512.4	421.3	389.6	458.6
	Total prediction	1.5	2.0	1.6	3.9	2.0	2.1	1.9	2.3
MovieLens 1M	Training (per epoch)	125.1	325.1	208.1	805.5	308.6	425.3	456.2	430.0
	Total training	3 481.7	6 502.5	4 994.4	19 332.2	6 058.6	4 379.5	4 453.6	4 620.3
	Total prediction	11.6	18.6	11.3	21.6	13.0	15.6	20.0	17.3
MyAnimeList	Training (per epoch)	389.4	653.4	442.9	1 970.1	555.2	582.4	568.4	545.9
	Total training	10 903.2	11 454.4	8 035.1	43 342.2	12 510.6	3 510.8	3 814.4	2 183.6
	Total prediction	12.5	12.5	13.9	37.9	20.1	18.8	17.8	15.3

Table 6 Hyper-parameters

Hyper-parameter	MovieLens 100K	MovieLens 1M	MyAnimeList
Embedding dimension	8	64	64
Last hidden layer dimension	8	256	256
Learning rate	0.0005	0.0010	0.0001
Batch size	256	256	512
