Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (3): 637-648. DOI: 10.3778/j.issn.1673-9418.2009011
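ZHANG Quangui+, HU Jiayan, WANG Li
Corresponding author: + E-mail: zhqgui@126.com
About author: ZHANG Quangui (born 1978), male, from Qinhuangdao, Hebei, Ph.D., associate professor, member of CCF. His research interests include deep learning and recommender systems.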
Received: 2020-09-07
Revised: 2020-11-06
Online: 2022-03-01
Published: 2020-11-19
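Abstract:
Combining explicit features with implicit feedback is a common way to improve the recommendation accuracy of one-class collaborative filtering (OCCF). However, existing studies generally integrate raw explicit features or cross features directly into the OCCF model. Because it is hard to determine which explicit features actually matter, significant performance gains are difficult to obtain. To address this, a one-class collaborative filtering recommendation algorithm coupled with user common characteristics (UCC-OCCF) is proposed. First, a neighbor-based common preference representation network (NB-CPR) is built to learn the interactions between neighbor users, who share similar explicit features with the current user, and a certain category of items, so that explicit features are used indirectly to obtain common preferences. Then, a personal deep latent factor representation network (DLFR) is built, which uses a deep neural network to learn the latent factors between users and items and thereby obtain the interaction probability between the current user and an item. Finally, NB-CPR and DLFR are trained jointly, which couples user common characteristics into the OCCF recommendation model and improves recommendation accuracy. Experimental results on the public datasets MovieLens 100K, MovieLens 1M and MyAnimeList show that UCC-OCCF can significantly improve the recommendation accuracy of OCCF.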
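The neighbor-selection step behind NB-CPR can be illustrated with a small sketch. The abstract only states that neighbors are users with similar explicit features; the three similarity functions below (cosine, Hamming, Pearson) are an assumption based on the COS, HAM and Pear suffixes of the UCC-OCCF variants in Table 5, and the function names, the one-hot encoding and the toy data are all hypothetical rather than taken from the paper.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def hamming_sim(a, b):
    """Fraction of feature positions on which the two vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def pearson_sim(a, b):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return cosine_sim([x - mean_a for x in a], [y - mean_b for y in b])

def top_k_neighbors(features, user, k, sim=cosine_sim):
    """Return the k users most similar to `user`, ties broken by user index."""
    scored = [(sim(features[user], features[v]), v)
              for v in range(len(features)) if v != user]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [v for _, v in scored[:k]]

# Hypothetical one-hot encoded explicit features (e.g. gender, age bucket, occupation).
features = [
    [1, 0, 1, 0, 0, 1],  # user 0
    [1, 0, 1, 0, 0, 1],  # user 1: same features as user 0
    [0, 1, 0, 1, 1, 0],  # user 2: differs from user 0 in every position
    [1, 0, 0, 1, 0, 1],  # user 3: partial overlap with user 0
]

neighbors = top_k_neighbors(features, user=0, k=2)
print(neighbors)
```

In this toy setting all three measures rank user 1 first and user 3 second for user 0; on real data the measures can disagree, which is presumably why the variants are reported separately.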
ZHANG Quangui, HU Jiayan, WANG Li. One Class Collaborative Filtering Recommendation Algorithm Coupled with User Common Characteristics[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(3): 637-648.
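Table 1 Mathematical notations
Symbol | Description |
---|---|
| User set |
| Item set |
| User set |
| Item set |
| User |
| Item |
| Vector |
| Vector |
| Weight matrix of the fully connected layer |
| The …-th |
| Current user |
| The …-th of the training/test dataset |
| The …-th of the test dataset |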
Table 2 Statistics of datasets
Dataset | Users | Items | Interactions |
---|---|---|---|
MovieLens 100K | 943 | 1,682 | 100,000 |
MovieLens 1M | 6,040 | 3,592 | 1,000,209 |
MyAnimeList | 9,130 | 14,478 | 2,945,242 |
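The statistics in Table 2 can be turned into interaction densities, which indicate how sparse the one-class feedback is in each dataset. The short calculation below is an illustration added here, not a figure reported in the source; the `density` helper is a hypothetical name.

```python
# (users, items, interactions) as listed in Table 2.
datasets = {
    "MovieLens 100K": (943, 1682, 100000),
    "MovieLens 1M": (6040, 3592, 1000209),
    "MyAnimeList": (9130, 14478, 2945242),
}

def density(users, items, interactions):
    """Fraction of all user-item pairs that have an observed interaction."""
    return interactions / (users * items)

for name, stats in datasets.items():
    print(f"{name}: density = {density(*stats):.4f}")
# MovieLens 100K: density = 0.0630
# MovieLens 1M: density = 0.0461
# MyAnimeList: density = 0.0223
```

In each dataset more than 93% of user-item pairs are unobserved, which is the setting OCCF targets: only positive interactions are recorded, and the remaining pairs are unlabeled rather than negative.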
Table 3 Explicit features of users and items
Category | Observable features |
---|---|
User | Gender, age, occupation, zip code |
Item | Title, genre |
Table 4 Test hyper-parameters
Tested parameter | Candidate values |
---|---|
Embedding layer dimension | {8, 16, 32, 64, 128, 256} |
Last hidden layer dimension | {8, 16, 32, 64, 128, 256} |
Learning rate | {0.0001, 0.0005, 0.0010, 0.0030, 0.0050, 0.0100} |
Batch size | {128, 256, 512, 1024} |
Number of user neighbors | {10, 20, 30, 40, 50, 100} |
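The candidate values in Table 4 define a search space that can be enumerated as follows. This is a minimal sketch: the source does not say whether an exhaustive grid search or a one-parameter-at-a-time sweep was used, and the dictionary keys and the `iter_configs` helper are hypothetical names.

```python
from itertools import product

# Candidate values from Table 4; the key names are illustrative only.
search_space = {
    "embedding_dim": [8, 16, 32, 64, 128, 256],
    "last_hidden_dim": [8, 16, 32, 64, 128, 256],
    "learning_rate": [0.0001, 0.0005, 0.001, 0.003, 0.005, 0.01],
    "batch_size": [128, 256, 512, 1024],
    "num_neighbors": [10, 20, 30, 40, 50, 100],
}

def iter_configs(space):
    """Yield every combination of candidate values as a dict."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))

configs = list(iter_configs(search_space))
print(len(configs))  # 6 * 6 * 6 * 4 * 6 = 5184 combinations
```

A full grid over 5184 configurations would be costly at the per-epoch training times reported in Table 5, so varying one parameter while holding the others fixed is a common cheaper alternative.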
Table 5 Time for training and prediction
Dataset | Time (s) | itemKNN | Wide&Deep | NeuMF | NGCF | DeepICF | UCC-OCCF-COS | UCC-OCCF-HAM | UCC-OCCF-Pear |
---|---|---|---|---|---|---|---|---|---|
MovieLens 100K | Training (per epoch) | 12.3 | 24.5 | 13.1 | 14.4 | 20.3 | 25.7 | 20.6 | 28.6 |
MovieLens 100K | Total training | 293.4 | 575.8 | 510.6 | 432.6 | 512.4 | 421.3 | 389.6 | 458.6 |
MovieLens 100K | Total prediction | 1.5 | 2.0 | 1.6 | 3.9 | 2.0 | 2.1 | 1.9 | 2.3 |
MovieLens 1M | Training (per epoch) | 125.1 | 325.1 | 208.1 | 805.5 | 308.6 | 425.3 | 456.2 | 430.0 |
MovieLens 1M | Total training | 3,481.7 | 6,502.5 | 4,994.4 | 19,332.2 | 6,058.6 | 4,379.5 | 4,453.6 | 4,620.3 |
MovieLens 1M | Total prediction | 11.6 | 18.6 | 11.3 | 21.6 | 13.0 | 15.6 | 20.0 | 17.3 |
MyAnimeList | Training (per epoch) | 389.4 | 653.4 | 442.9 | 1,970.1 | 555.2 | 582.4 | 568.4 | 545.9 |
MyAnimeList | Total training | 10,903.2 | 11,454.4 | 8,035.1 | 43,342.2 | 12,510.6 | 3,510.8 | 3,814.4 | 2,183.6 |
MyAnimeList | Total prediction | 12.5 | 12.5 | 13.9 | 37.9 | 20.1 | 18.8 | 17.8 | 15.3 |
Table 6 Hyper-parameters
Hyper-parameter | MovieLens 100K | MovieLens 1M | MyAnimeList |
---|---|---|---|
Embedding layer dimension | 8 | 64 | 64 |
Last hidden layer dimension | 8 | 256 | 256 |
Learning rate | 0.0005 | 0.0010 | 0.0001 |
Batch size | 256 | 256 | 512 |