Survey of Deep Learning Based Extractive Summarization

doi:10.3778/j.issn.1673-9418.2308100

Abstract

Abstract: Automatic text summarization (ATS) is an active research direction in natural language processing, and its main approaches fall into two categories: extractive and abstractive. Extractive summarization takes its text directly from the source document; compared with abstractive summarization, it offers higher grammatical and factual correctness, and it has broad application prospects in domains that demand rigor, such as policy interpretation, official document summarization, law, and medicine. Extractive summarization based on deep learning has received extensive attention in recent years. This paper reviews recent research progress in deep learning based extractive summarization and analyzes related work on its two key steps: text unit encoding and summary extraction. First, according to the model framework, text unit encoding methods are divided into four categories: hierarchical sequential encoding, graph neural network based encoding, fusion encoding, and pre-training-based encoding. Then, according to the extraction granularity used in the summary extraction stage, summary extraction methods are divided into two categories: text unit-level extraction and summary-level extraction. The paper also introduces the public datasets and performance evaluation metrics commonly used for extractive summarization. Finally, possible future research directions and the corresponding development trends in this field are predicted and summarized.

Key words: deep learning, extractive summarization, text unit encoding, summary extraction

TIAN Xuan, LI Jialiang, MENG Xiaohuan. Survey of Deep Learning Based Extractive Summarization[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(11): 2823-2847.
