Two-Phase Crowdsourced Comment Integration Method Based on Reward Prediction and Policy Gradient

doi:10.3778/j.issn.1673-9418.2005023

Abstract

With the rapid development of the Internet in recent years, people frequently post comments online about specific objects. Promptly extracting the key information from such crowdsourced comments is important for decision-making and service adjustment, which makes crowdsourced comment integration a problem of practical value. The goal of crowdsourced comment integration is to condense different users' comments on a target object into a shorter integrated document at a given compression rate, so that the result describes the target object in a way that matches public perception. To solve this problem, a two-phase crowdsourced comment integration method based on reward prediction and policy gradient is proposed. The method does not rely on any manually created ground truth; it requires only the source crowdsourced comments, from which an agent, guided by the rewards it has experienced, extracts key sentences to form the integrated comment. Specifically, in the first phase, the content quality of the integrated comment is measured by the relevance and redundancy of its sentences and used as the reward. Q-value learning then predicts the long-term reward accumulated from the selection of the current sentence to the end of the integration process, and this prediction guides the agent toward an optimal sentence selection policy. In the second phase, the sentiment intensity of the integrated comment serves as the reward, and the sentence selection policy learned in the first phase is further adjusted by policy gradient (ascent), so that the integrated comment maintains its content quality while objectively highlighting sentiment intensity and reflecting users' attitudes more clearly. Experimental results show that, compared with existing methods, the proposed method achieves the best overall performance in both the content quality and the sentiment intensity of the integrated comment, while the time needed to generate the integrated comment remains acceptable.

Key words: crowdsourced data integration, truth inference, deep learning, artificial intelligence
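
The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words vectors, the reward formulas (relevance minus redundancy, mean absolute lexicon score), the tabular Q-learning, the REINFORCE-style update, and all names, sentences, and hyperparameters below are assumptions chosen only to keep the example small and runnable.

```python
import math
import random
from collections import Counter, defaultdict

def bow(sentence):
    """Bag-of-words vector of one sentence (toy representation, an assumption)."""
    return Counter(sentence.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def content_quality(chosen, vecs):
    """Assumed reward: mean relevance to all comments minus mean pairwise redundancy."""
    chosen = sorted(chosen)
    if not chosen:
        return 0.0
    doc = sum(vecs, Counter())
    relevance = sum(cosine(vecs[i], doc) for i in chosen) / len(chosen)
    pairs = [(i, j) for i in chosen for j in chosen if i < j]
    redundancy = sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    return relevance - redundancy

def phase1_q_learning(vecs, budget, episodes=500, alpha=0.5, gamma=1.0, eps=0.2, seed=0):
    """Phase 1 (sketch): tabular Q-learning; Q predicts the long-term content reward."""
    rng = random.Random(seed)
    q = defaultdict(float)  # key: (frozenset of chosen indices, candidate index)
    n = len(vecs)
    for _ in range(episodes):
        state = frozenset()
        while len(state) < budget:
            actions = [a for a in range(n) if a not in state]
            if rng.random() < eps:                      # explore
                a = rng.choice(actions)
            else:                                       # exploit current Q estimates
                a = max(actions, key=lambda x: q[(state, x)])
            nxt = state | {a}
            # Immediate reward: change in content quality caused by adding sentence a.
            r = content_quality(nxt, vecs) - content_quality(state, vecs)
            future = 0.0
            if len(nxt) < budget:
                future = max(q[(nxt, b)] for b in range(n) if b not in nxt)
            q[(state, a)] += alpha * (r + gamma * future - q[(state, a)])
            state = nxt
    return q

def sentiment_intensity(chosen, sentences, lexicon):
    """Assumed reward: mean absolute lexicon score of the selected sentences."""
    if not chosen:
        return 0.0
    scores = [abs(sum(lexicon.get(w, 0.0) for w in sentences[i].lower().split())) for i in chosen]
    return sum(scores) / len(scores)

def phase2_policy_gradient(theta, sentences, lexicon, budget, episodes=300, lr=0.1, seed=1):
    """Phase 2 (sketch): REINFORCE-style gradient ascent on a softmax selection policy."""
    rng = random.Random(seed)
    theta, baseline = list(theta), 0.0
    for _ in range(episodes):
        remaining, chosen = list(range(len(theta))), []
        grad = [0.0] * len(theta)
        for _step in range(budget):
            weights = [math.exp(theta[j]) for j in remaining]
            z = sum(weights)
            a = rng.choices(remaining, weights=weights)[0]
            for j, w in zip(remaining, weights):        # gradient of log softmax probability
                grad[j] += (1.0 if j == a else 0.0) - w / z
            remaining.remove(a)
            chosen.append(a)
        g = sentiment_intensity(chosen, sentences, lexicon)
        for j in range(len(theta)):
            theta[j] += lr * (g - baseline) * grad[j]   # ascent step with a running baseline
        baseline += 0.1 * (g - baseline)
    return theta

# Hypothetical comments and sentiment lexicon, for illustration only.
sentences = [
    "the food was great and the staff were friendly",
    "great food and friendly staff",
    "parking is hard to find",
    "the service was terrible and slow",
]
lexicon = {"great": 1.0, "friendly": 0.5, "terrible": -1.0, "slow": -0.5, "hard": -0.3}
vecs = [bow(s) for s in sentences]
budget = max(1, round(0.5 * len(sentences)))            # compression rate of 0.5
q = phase1_q_learning(vecs, budget)
theta = [q[(frozenset(), i)] for i in range(len(sentences))]  # start phase 2 from phase-1 values
theta = phase2_policy_gradient(theta, sentences, lexicon, budget)
summary = sorted(range(len(sentences)), key=lambda i: -theta[i])[:budget]
print([sentences[i] for i in sorted(summary)])
```

In this sketch the phase-2 policy is initialized from the phase-1 Q-values of the empty selection, which is one simple way to carry the content-oriented policy into the sentiment-oriented adjustment; the paper's own coupling between the two phases may differ.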

RONG Huan, MA Tinghuai. Two-Phase Crowdsourced Comment Integration Method Based on Reward Prediction and Policy Gradient[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(8): 1476-1489.
