SFExt-PGAbs: Two-Stage Summarization Model for Long Document

doi:10.3778/j.issn.1673-9418.2006002

Abstract

Extractive summarization of long documents tends to produce less fluent summaries, abstractive summarization tends to produce less accurate ones, and truncating the original document before encoding discards important information. To address these problems, this paper proposes SFExt-PGAbs, a two-stage long-document summarization model composed of SFExt, a submodular-function extractive summarizer, and PGAbs, a pointer-generator abstractive summarizer. SFExt-PGAbs mimics how a human summarizes a long document: SFExt first extracts the important sentences and filters out unimportant and redundant ones to form a transitional document, and PGAbs then takes the transitional document as input and generates a fluent and accurate summary. To obtain a transitional document that is closer to the central idea of the original document, the traditional SFExt is extended with two sub-aspects, positional importance and accuracy, and a new greedy algorithm is designed. To study how different feature extractors affect the quality of the generated summary, two kinds of recurrent neural networks are applied in PGAbs. Experiments on the CNNDM test set show that SFExt-PGAbs generates more fluent and more accurate summaries than the baseline model, with a considerable improvement in ROUGE scores. SFExt with the extended sub-aspects also extracts more accurate summaries.

Key words: two-stage summarization model, long document summarization, extractive summarization, abstractive summarization, submodular function, pointer generator, sub-aspect fusion
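The extractive first stage can be illustrated with a small sketch of budgeted greedy selection under a monotone submodular objective. This is not the paper's SFExt: the abstract does not give the objective or the new greedy algorithm, so the facility-location coverage term, the reciprocal-rank position bonus, and the names `greedy_extract`, `budget`, and `position_weight` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens; a simple stand-in for real preprocessing."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_extract(sentences, budget, position_weight=0.1):
    """Pick sentence indices greedily under a word budget.

    Objective (an assumption, not the paper's function):
    facility-location coverage plus a weighted position bonus.
    Returns the selected indices in document order.
    """
    bows = [Counter(tokenize(s)) for s in sentences]
    lengths = [max(1, sum(b.values())) for b in bows]
    n = len(sentences)
    sim = [[cosine(bows[i], bows[j]) for j in range(n)] for i in range(n)]

    def objective(selected):
        # Coverage: every sentence is represented by its closest selected one.
        coverage = sum(
            max((sim[i][j] for j in selected), default=0.0) for i in range(n)
        )
        # Hypothetical position bonus: earlier sentences score higher.
        position = sum(1.0 / (1 + j) for j in selected)
        return coverage + position_weight * position

    selected, used = [], 0
    while True:
        base = objective(selected)
        best, best_gain = None, 0.0
        for j in range(n):
            if j in selected or used + lengths[j] > budget:
                continue
            # Marginal gain per word, so long sentences must earn their cost.
            gain = (objective(selected + [j]) - base) / lengths[j]
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:
            break
        selected.append(best)
        used += lengths[best]
    return sorted(selected)
```

Joining the selected sentences in document order gives a shortened text that plays the role of the transitional document; in a two-stage setup that text, rather than a truncated original, would be passed to the abstractive model.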

ZHOU Weixiao, LAN Wenfei, XU Zhiming, ZHU Rongbo. SFExt-PGAbs: Two-Stage Summarization Model for Long Document[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(5): 907-921.
