MOOCs Knowledge Concept Recommendation Model Based on PathSim

doi:10.3778/j.issn.1673-9418.2305069

Abstract

Abstract: Massive open online courses (MOOCs) provide large-scale open online learning platforms and play a key role in advancing modern education. However, reducing learners' blind spots and improving the user experience remain challenging for four reasons. First, interaction data are sparse. Second, scaling to large recommendation tasks is difficult. Third, user needs are determined not only by personal preferences but also by teachers and courses. Fourth, modeling the different types of entities and relations in course learning events in a single uniform way is unreliable. To address these issues, this paper introduces a relevance measure that computes the weight of each edge from the structural information of the entire graph, and proposes PathSimSage (path-based similarity sampler and aggregate), a knowledge concept recommendation model that uses the PathSim (path-based similarity) algorithm for neighborhood sampling. The relevance scores between entities can be precomputed offline, which decouples the neural network from the propagation process, keeps the number of stacked network layers independent of the propagation process, and substantially reduces training time. Comprehensive experiments on the public MoocCube dataset show that PathSimSage reduces the influence of irrelevant information and noise, resolves the bias toward high-degree nodes caused by random-walk sampling, and alleviates over-smoothing to some extent.

Key words: massive open online courses, graph neural networks, personalized course recommendations, graph convolution, metapath-based subgraphs, similarity measure
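
The abstract describes scoring entity pairs with PathSim offline and then sampling neighbors by relevance rather than by random walk. The sketch below illustrates that idea only; it is not the authors' implementation. It uses the standard PathSim definition for a symmetric meta-path, s(x, y) = 2·M[x, y] / (M[x, x] + M[y, y]), where M is the commuting matrix of path counts. The concept-course-concept meta-path, the toy matrix, the top-k selection rule, and the names `pathsim_matrix` and `top_k_neighbors` are all assumptions made for illustration.

```python
import numpy as np


def pathsim_matrix(adjacency: np.ndarray) -> np.ndarray:
    """PathSim scores for the symmetric meta-path A-B-A.

    adjacency[i, j] counts links between entity i of type A (here assumed
    to be a knowledge concept) and entity j of type B (assumed: a course).
    """
    # Commuting matrix: M[x, y] = number of A-B-A path instances from x to y.
    commuting = adjacency @ adjacency.T
    self_paths = np.diag(commuting)
    denom = self_paths[:, None] + self_paths[None, :]
    scores = np.zeros_like(commuting, dtype=float)
    # Pairs whose denominator is 0 (both nodes isolated) keep a score of 0.
    np.divide(2.0 * commuting, denom, out=scores, where=denom > 0)
    return scores


def top_k_neighbors(scores: np.ndarray, k: int) -> dict:
    """Keep the k highest-scoring neighbors of each node, excluding itself."""
    neighbors = {}
    for node in range(scores.shape[0]):
        row = scores[node].copy()
        row[node] = -np.inf  # never sample a node as its own neighbor
        order = np.argsort(-row, kind="stable")[:k]
        # Drop candidates with zero relevance (no connecting path).
        neighbors[node] = [(int(j), float(row[j])) for j in order if row[j] > 0]
    return neighbors


# Hypothetical toy data: 4 knowledge concepts (rows) x 3 courses (columns).
concept_course = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
])

scores = pathsim_matrix(concept_course)
sampled = top_k_neighbors(scores, k=2)
```

Because the scores depend only on the graph structure, they can be computed once and cached before training, which is the kind of decoupling between propagation and the neural network that the abstract refers to. Normalizing by the self-path counts also keeps a node with many links from dominating the ranking, in contrast to visit counts from random walks.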

ZHU Yi (祝义), JU Chengcheng (居程程), HAO Guosheng (郝国生). MOOCs Knowledge Concept Recommendation Model Based on PathSim[J]. Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2024, 18(8): 2049-2064.
