Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (7): 1594-1602. DOI: 10.3778/j.issn.1673-9418.2101045

• Artificial Intelligence •

Groupwise Learning to Rank Algorithm with Introduction of Activated Weighting

LI Yuxuan1,2,+, HONG Xuehai1,3, WANG Yang1, TANG Zhengzheng1,2, BAN Yan1

    1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received:2021-01-11 Revised:2021-03-16 Online:2022-07-01 Published:2021-04-09

  • About the authors:
    LI Yuxuan (李玉轩), born in 1996 in Suqian, Jiangsu, M.S. candidate. His research interests include machine learning, information retrieval, etc.
    HONG Xuehai (洪学海), born in 1967 in Chaohu, Anhui, Ph.D., professor, Ph.D. supervisor, outstanding member of CCF. His research interests include high-performance computing, artificial intelligence, big data, etc.
    WANG Yang (汪洋), born in 1984 in Wuhan, Hubei, Ph.D., senior engineer, M.S. supervisor, director of the Informatization Development Strategy and Evaluation Center at the Computer Network Information Center, Chinese Academy of Sciences. His research interests include big data analysis, situational awareness systems, research on informatization development strategy, etc.
    TANG Zhengzheng (唐正正), born in 1992 in Huaiyuan, Anhui, Ph.D. candidate. His research interests include machine learning, data mining, graph representation learning, etc.
    BAN Yan (班艳), born in 1981 in Beijing, senior engineer. Her research interest is research on informatization development strategy.

Abstract:

Learning to rank (LtR) applies supervised machine learning (SML) techniques to ranking problems, aiming to produce an ordering of an input document list that is better optimized for relevance. In previous studies of deep ranking models, the relevance of each document in the list is computed independently, without considering interactions between documents. In recent years, some new methods have sought to mine these interactions, such as the groupwise scoring function (GSF), which learns a multivariate scoring function to judge document relevance jointly. However, most of these methods ignore the differences among the interactions between documents and also incur a high computational cost. To address this problem, this paper proposes a weighted groupwise deep ranking model (W-GSF). Drawing on the deep interest network from the recommendation field, the method borrows its idea of adjusting the weights of the historical behavior sequence according to the candidate item. Building on the multivariate scoring method in learning to rank, it uses a multi-layer feed-forward neural network as the main structure and adds an activation unit at the input, so that the network adaptively learns to adjust the weights of the multiple input variables and thereby mines the differences in cross-document relationships. Experiments on the public benchmark dataset MSLR verify the effectiveness of the method. Compared with baseline ranking models, the activation strategy brings a marked improvement in ranking metrics, and the computational cost is greatly reduced compared with learning to rank methods of comparable effectiveness.

Key words: learning to rank (LtR), groupwise scoring function (GSF), deep neural network, deep interest network
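
The idea summarized in the abstract, reweighting the documents of a group with an activation unit before a feed-forward scorer sees them, can be illustrated with a rough sketch. This is not the authors' W-GSF architecture: the layer sizes, the sigmoid gate, the use of an element-wise product as the interaction feature, and the names `activation_weight` and `score_group` are all assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random


def dense(x, W, b):
    """Affine layer W @ x + b on plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Random (untrained) weights; a real model would learn these."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def activation_weight(target, other, unit):
    """Hypothetical activation unit: a small MLP over
    [target, other, target * other] squashed to a weight in (0, 1)."""
    (W1, b1), (W2, b2) = unit
    interaction = [t * o for t, o in zip(target, other)]
    hidden = relu(dense(target + other + interaction, W1, b1))
    logit = dense(hidden, W2, b2)[0]
    return 1.0 / (1.0 + math.exp(-logit))


def score_group(group, unit, scorer):
    """Score each document from its own features plus its peers' features,
    where every peer is first scaled by its activation weight."""
    (W1, b1), (W2, b2) = scorer
    scores = []
    for i, target in enumerate(group):
        weighted_peers = []
        for j, other in enumerate(group):
            if j == i:
                continue
            w = activation_weight(target, other, unit)
            weighted_peers.extend(w * v for v in other)
        hidden = relu(dense(target + weighted_peers, W1, b1))
        scores.append(dense(hidden, W2, b2)[0])
    return scores


# Toy setup (assumed sizes): groups of 3 documents with 4 features each.
rng = random.Random(0)
n_features, group_size, n_hidden = 4, 3, 8
unit = (init_layer(rng, 3 * n_features, n_hidden), init_layer(rng, n_hidden, 1))
scorer = (init_layer(rng, group_size * n_features, n_hidden), init_layer(rng, n_hidden, 1))
group = [[rng.random() for _ in range(n_features)] for _ in range(group_size)]
scores = score_group(group, unit, scorer)
ranking = sorted(range(group_size), key=lambda k: scores[k], reverse=True)
```

Here `ranking` lists document indices from highest to lowest score. The point of the gate is that a peer's contribution to a document's score depends on the pair, rather than every peer entering the scorer with equal weight.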


CLC Number: