Journal of Frontiers of Computer Science and Technology ›› 2021, Vol. 15 ›› Issue (8): 1476-1489. DOI: 10.3778/j.issn.1673-9418.2005023

• Artificial Intelligence •

Two-Phase Crowdsourced Comment Integration Method Based on Reward Prediction and Policy Gradient

RONG Huan (荣欢), MA Tinghuai (马廷淮)

  1. School of Artificial Intelligence, Nanjing University of Information Science & Technology, Nanjing 210044, China
  2. School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • Online: 2021-08-01 Published: 2021-08-02

Abstract:

With the rapid development of the Internet, people frequently post comments about specific objects online. Quickly grasping the key information in such crowdsourced comments is important for decision-making and service adjustment, so crowdsourced comment integration merits in-depth study. Crowdsourced comment integration aims to merge the comments that different reviewers write about the same object into a shorter integrated document at a given compression rate, yielding a description of the object that matches public perception reasonably well. To address this problem, a two-phase crowdsourced comment integration method based on reward prediction and policy gradient is proposed. The method does not rely on any human-annotated ground truth: given only the source crowdsourced comment documents, an agent extracts key sentences on its own, guided by its reward experience, to form the integrated comment document. Specifically, the first phase measures the content quality of the integrated document by sentence relevance and redundancy and uses this quality as the reward. Q-value learning predicts the long-term reward accumulated from the current sentence selection until integration ends, and this prediction guides the agent toward an optimal sentence selection policy. Building on this, the second phase uses the sentiment intensity of the integrated document as the reward and applies policy gradient (ascent) to further adjust the sentence selection policy learned in the first phase, so that the integrated text retains a certain level of content quality while objectively highlighting sentiment intensity and reflecting the reviewers' attitudes more clearly. Experimental results show that, compared with existing related methods, the proposed method achieves the best overall content quality and sentiment intensity, while the time needed to generate the integrated document remains within an acceptable range.

Key words: crowdsourced data integration, truth inference, deep learning, artificial intelligence
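
The first-phase idea in the abstract (a reward built from sentence relevance and redundancy, and a long-term reward running from the current selection to the end of integration) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the bag-of-words cosine similarity, the "relevance minus redundancy" formula, the `redundancy_weight` and `gamma` parameters, and the greedy selection loop, which stands in for the learned Q-value policy. The second-phase sentiment reward and the policy gradient update are not modeled.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-cased whitespace tokens as a term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def step_reward(candidate, selected, document, redundancy_weight=1.0):
    """Assumed reward: relevance to the whole document minus overlap with prior picks."""
    relevance = cosine(candidate, document)
    redundancy = max((cosine(candidate, s) for s in selected), default=0.0)
    return relevance - redundancy_weight * redundancy


def integrate(sentences, compression_rate):
    """Greedily pick sentences until the compression budget is reached.

    The greedy rule is a stand-in for a learned policy (hypothetical simplification).
    """
    vectors = [bag_of_words(s) for s in sentences]
    document = sum(vectors, Counter())
    budget = min(len(sentences), max(1, math.ceil(compression_rate * len(sentences))))
    chosen, rewards = [], []
    while len(chosen) < budget:
        remaining = [i for i in range(len(sentences)) if i not in chosen]
        picked = [vectors[i] for i in chosen]
        best = max(remaining, key=lambda i: step_reward(vectors[i], picked, document))
        rewards.append(step_reward(vectors[best], picked, document))
        chosen.append(best)
    return chosen, rewards


def returns_to_go(rewards, gamma=0.9):
    """Discounted sum of rewards from each step to the end of the episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]


if __name__ == "__main__":
    comments = [
        "battery life is great",
        "battery life is great",
        "screen is dim",
        "price is fair",
    ]
    chosen, rewards = integrate(comments, compression_rate=0.5)
    print([comments[i] for i in chosen])
    print(returns_to_go(rewards))
```

In this sketch, `returns_to_go` computes the kind of quantity a Q-value estimator would be trained to predict: the reward accumulated from a given selection step until integration ends. The duplicated comment is skipped on the second pick because its redundancy term cancels its relevance.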