Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (4): 877-887. DOI: 10.3778/j.issn.1673-9418.2010066
WANG Hongbin1,2, JIN Ziling1,2, MAO Cunli1,2,+
+ Corresponding author, E-mail: maocunli@163.com
Received: 2020-10-26
Revised: 2021-01-06
Online: 2022-04-01
Published: 2021-02-03
About author: WANG Hongbin, born in 1983 in Qujing, Yunnan, Ph.D., associate professor, M.S. supervisor. His research interests include intelligent information systems, natural language processing and data analysis.
Abstract:
In extractive summarization, sentence selection involves a strong element of subjective human judgment, so it is hard to measure accurately and objectively how important each sentence of a document actually is to the summary, and how much each word contributes to the importance of its sentence, which degrades the quality of the extracted summary. To address this problem, an extractive automatic summarization method for news text combined with hierarchical attention is proposed. First, the method encodes English news text hierarchically, applying word-level attention and then sentence-level attention to obtain a text representation that combines hierarchical attention. Second, a dynamic scoring function is constructed with a neural network, and at each step the candidate sentence with the highest score is selected as a summary sentence. Finally, the summary corresponding to the English news text is extracted. The proposed method is validated on the public CNN/Daily Mail, New York Times and Multi-News datasets. Experimental results show that its ROUGE scores are comparable to those of the current best models, and its ROUGE F1 values exceed the baseline by 1.78, 0.70 and 1.44 percentage points respectively, which demonstrates that the method is generalizable and effective for extractive summarization of English news text and has certain advantages over existing methods.
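The pipeline described above (hierarchical encoding, word-level then sentence-level attention, and a neural scoring function that selects the highest-scoring candidate sentence) can be illustrated with a minimal PyTorch sketch. This is not the authors' released code: all module names and hyperparameters below are assumptions, and for brevity the sketch scores every sentence once and takes the top k, whereas the paper's NeuSum-style selector rescores the remaining candidates dynamically after each pick.

```python
# Minimal, illustrative sketch of a hierarchical-attention extractive scorer.
# All names and sizes are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Attention(nn.Module):
    """Additive attention that pools a sequence of states into one vector."""

    def __init__(self, hidden_size):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)
        self.query = nn.Linear(hidden_size, 1, bias=False)

    def forward(self, states):  # states: (batch, seq_len, hidden_size)
        scores = self.query(torch.tanh(self.proj(states)))  # (batch, seq_len, 1)
        weights = F.softmax(scores, dim=1)
        return (weights * states).sum(dim=1)  # (batch, hidden_size)


class HierarchicalExtractor(nn.Module):
    """Word-level attention builds sentence vectors; sentence-level attention
    builds a document vector; an MLP scores each sentence against it."""

    def __init__(self, vocab_size, embed_size=128, hidden_size=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.word_rnn = nn.GRU(embed_size, hidden_size // 2,
                               batch_first=True, bidirectional=True)
        self.word_attn = Attention(hidden_size)   # word-level attention
        self.sent_rnn = nn.GRU(hidden_size, hidden_size // 2,
                               batch_first=True, bidirectional=True)
        self.sent_attn = Attention(hidden_size)   # sentence-level attention
        self.scorer = nn.Sequential(              # scoring function
            nn.Linear(hidden_size * 2, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, doc):  # doc: (n_sents, n_words) tensor of word ids
        word_states, _ = self.word_rnn(self.embed(doc))   # (n_sents, n_words, h)
        sent_vecs = self.word_attn(word_states)           # (n_sents, h)
        sent_states, _ = self.sent_rnn(sent_vecs.unsqueeze(0))
        sent_states = sent_states.squeeze(0)              # (n_sents, h)
        doc_vec = self.sent_attn(sent_states.unsqueeze(0))  # (1, h)
        # Score each sentence against the attention-pooled document vector.
        feats = torch.cat([sent_states, doc_vec.expand_as(sent_states)], dim=-1)
        return self.scorer(feats).squeeze(-1)             # (n_sents,) scores


model = HierarchicalExtractor(vocab_size=30000)
doc = torch.randint(0, 30000, (12, 40))   # toy document: 12 sentences x 40 words
scores = model(doc)
summary_ids = sorted(scores.topk(3).indices.tolist())  # extract 3 sentences
print(summary_ids)
```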
WANG Hongbin, JIN Ziling, MAO Cunli. Extractive News Text Automatic Summarization Combined with Hierarchical Attention[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(4): 877-887.
| Dataset | News articles | Reference summaries |
| --- | --- | --- |
| Training set | 287 227 | 287 227 |
| Validation set | 13 368 | 13 368 |
| Test set | 11 490 | 11 490 |
Table 1 CNN/Daily Mail dataset
| Dataset | News articles | Reference summaries |
| --- | --- | --- |
| Training set | 137 900 | 137 900 |
| Validation set | 2 000 | 2 000 |
| Test set | 9 934 | 9 934 |
Table 2 New York Times dataset
| Dataset | News articles | Reference summaries |
| --- | --- | --- |
| Training set | 44 972 | 44 972 |
| Validation set | 5 622 | 5 622 |
| Test set | 5 622 | 5 622 |
Table 3 Multi-News dataset
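For readers who wish to check the split statistics in Tables 1-3, two of the three corpora have public loaders. The sketch below assumes the Hugging Face `datasets` library rather than the paper's (unstated) data pipeline; the New York Times corpus is licensed through the LDC and has no public loader, and counts may differ slightly from the tables depending on the preprocessing version.

```python
# Hypothetical reproduction of the split statistics in Tables 1 and 3 using
# the Hugging Face "datasets" library (an assumption; the paper does not
# describe its data pipeline). The New York Times corpus (Table 2) is
# licensed through the LDC and has no public loader, so it is omitted.
from datasets import load_dataset

cnn_dm = load_dataset("cnn_dailymail", "3.0.0")  # fields: article, highlights
multi_news = load_dataset("multi_news")          # fields: document, summary

for name, ds in [("CNN/Daily Mail", cnn_dm), ("Multi-News", multi_news)]:
    # Counts may differ slightly from the tables depending on preprocessing.
    print(name, {split: len(ds[split]) for split in ds})
```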
| Models | ROUGE-1 | ROUGE-2 | ROUGE-L |
| --- | --- | --- | --- |
| LEAD3 | 40.24 | 17.70 | 36.45 |
| TextRank | 40.20 | 17.56 | 36.44 |
| CRSum | 40.52 | 18.08 | 36.81 |
| NN-SE | 41.13 | 18.59 | 37.40 |
| PGN | 39.53 | 17.28 | 36.38 |
| NeuSum | 41.59 | 19.01 | 37.98 |
| HSG+Tri-Blocking | 42.95 | 19.76 | 39.23 |
| NeuSum+sentAtt | 43.05 | 19.82 | 37.71 |
| NeuSum+WordAtt | 43.06 | 19.87 | 38.64 |
| NeuSum+hieAtt | 43.37 | 20.13 | 38.89 |
Table 4 ROUGE F1 scores on CNN/Daily Mail (%)
| Models | ROUGE-1 | ROUGE-2 | ROUGE-L |
| --- | --- | --- | --- |
| LEAD3 | 31.17 | 15.59 | 27.86 |
| TextRank | 32.38 | 16.27 | 28.93 |
| CRSum | 31.46 | 15.50 | 27.96 |
| NN-SE | 36.66 | 19.88 | 32.97 |
| NeuSum | 37.71 | 20.49 | 33.92 |
| NeuSum+sentAtt | 37.39 | 20.42 | 33.31 |
| NeuSum+WordAtt | 38.89 | 20.72 | 34.09 |
| NeuSum+hieAtt | 38.41 | 20.99 | 33.09 |
Table 5 ROUGE F1 scores on New York Times (%)
| Models | ROUGE-1 | ROUGE-2 | ROUGE-L |
| --- | --- | --- | --- |
| LEAD3 | 43.08 | 14.27 | 38.97 |
| NeuSum | 43.47 | 16.60 | 39.09 |
| NeuSum+sentAtt | 44.54 | 17.17 | 39.37 |
| NeuSum+WordAtt | 44.82 | 17.06 | 36.85 |
| NeuSum+hieAtt | 44.91 | 18.06 | 39.35 |
Table 6 ROUGE F1 scores on Multi-News (%)
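All scores in Tables 4-6 above are ROUGE F1 values in percent [37]. A comparable evaluation can be sketched with the Python `rouge-score` package; this is an illustrative assumption, since work of this period typically reports output from the original ROUGE-1.5.5 Perl toolkit, whose numbers can differ slightly.

```python
# Computing ROUGE-1/2/L F1 for a candidate summary against a reference with
# the "rouge-score" package. Illustrative only: the paper presumably used
# the original ROUGE-1.5.5 toolkit [37], so absolute values may differ.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
reference = "police arrested the suspect at the scene on monday ."
candidate = "the suspect was arrested by police on monday ."
for name, score in scorer.score(reference, candidate).items():
    print(f"{name}: F1 = {100 * score.fmeasure:.2f}%")  # percent, as in Table 4
```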
[1] NALLAPATI R, ZHAI F, ZHOU B. SummaRuNNer: a recurrent neural network based sequence model for extractive summarization of documents[J]. arXiv:1611.04230, 2016.
[2] REN P, CHEN Z, REN Z, et al. Leveraging contextual sentence relations for extractive summarization using a neural attention model[C]// Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Tokyo, Aug 7-11, 2017. New York: ACM, 2017: 95-104.
[3] ZHOU Q, YANG N, WEI F, et al. Neural document summarization by jointly learning to score and select sentences[J]. arXiv:1807.02305, 2018.
[4] WANG H, WANG X, XIONG W, et al. Self-supervised learning for contextualized extractive summarization[J]. arXiv:1906.04466, 2019.
[5] LIU Y. Fine-tune BERT for extractive summarization[J]. arXiv:1903.10318, 2019.
[6] WANG D, LIU P, ZHENG Y, et al. Heterogeneous graph neural networks for extractive document summarization[J]. arXiv:2004.12393, 2020.
[7] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[J]. arXiv:1409.0473, 2014.
[8] LUHN H P. The automatic creation of literature abstracts[J]. IBM Journal of Research and Development, 1958, 2(2): 159-165.
[9] ERKAN G, RADEV D R. LexRank: graph-based lexical centrality as salience in text summarization[J]. Journal of Artificial Intelligence Research, 2004, 22: 457-479.
[10] CAO Z Q, WEI F R, DONG L, et al. Ranking with recursive neural networks and its application to multi-document summarization[C]// Proceedings of the 29th AAAI Conference on Artificial Intelligence, Austin, Jan 25-30, 2015. Menlo Park: AAAI, 2015: 2153-2159.
[11] LIU N, LU Y, TANG X J, et al. Multi-document summarization algorithm based on significance topic of LDA[J]. Journal of Frontiers of Computer Science and Technology, 2015, 9(2): 242-248.
[12] DA CUNHA I, FERNÁNDEZ S, VELÁZQUEZ-MORALES P, et al. A new hybrid summarizer based on vector space model, statistical physics and linguistics[C]// LNCS 4827: Proceedings of the 6th Mexican International Conference on Artificial Intelligence, Aguascalientes, Nov 4-10, 2007. Berlin, Heidelberg: Springer, 2007: 872-882.
[13] LI F, HUANG J Z, LI Z J, et al. Automatic summarization method of news texts using keywords expansion[J]. Journal of Frontiers of Computer Science and Technology, 2016, 10(3): 372-380.
[14] ORASAN C. The influence of personal pronouns for automatic summarisation of scientific articles[C]// Proceedings of the 5th Discourse Anaphora and Anaphor Resolution Colloquium, Azores, Sep 23-24, 2004. Berlin, Heidelberg: Springer, 2004: 127-132.
[15] MITKOV R, EVANS R, ORĂSAN C, et al. Anaphora resolution: to what extent does it help NLP applications?[C]// LNCS 4410: Proceedings of the 6th Discourse Anaphora and Anaphor Resolution Colloquium, Lagos, Mar 29-30, 2007. Berlin, Heidelberg: Springer, 2007: 179-190.
[16] KUPIEC J, PEDERSEN J O, CHEN F. A trainable document summarizer[C]// Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, Jul 9-13, 1995. New York: ACM, 1995: 68-73.
[17] SCHLESINGER J D, OKUROWSKI M E, CONROY J M, et al. Understanding machine performance in the context of human performance for multi-document summarization[C]// Proceedings of the 2002 Workshop on Automatic Summarization, Philadelphia, Jul 11-12, 2002. Gaithersburg: NIST, 2002: 71-77.
[18] CONROY J M, O’LEARY D P. Text summarization via hidden Markov models[C]// Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New Orleans, Sep 9-13, 2001. New York: ACM, 2001: 406-407.
[19] AONE C, OKUROWSKI M E, GORLINSKY J, et al. Trainable, scalable summarization using robust NLP and machine learning[C]// Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Montreal, Aug 10-14, 1998. New York: ACM, 1998: 62-66.
[20] SVORE K M, VANDERWENDE L, BURGES C J C. Enhancing single-document summarization by combining RankNet and third-party sources[C]// Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague, Jun 28-30, 2007. Stroudsburg: ACL, 2007: 448-457.
[21] BURGES C J C, SHAKED T, RENSHAW E, et al. Learning to rank using gradient descent[C]// Proceedings of the 22nd International Conference on Machine Learning, Bonn, Aug 7-11, 2005. New York: ACM, 2005: 89-96.
[22] SCHILDER F, KONDADADI R. FastSum: fast and accurate query-based multi-document summarization[C]// Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies, Columbus, Jun 15-20, 2008. New York: ACM, 2008: 205-208.
[23] LI S, OUYANG Y, WANG W, et al. Multi-document summarization using support vector regression[C]// Proceedings of the 2007 Document Understanding Conference, Rochester, Apr 2007. New York: Academic, 2007: 45-50.
[24] MANI I, BLOEDORN E. Multi-document summarization by graph search and matching[C]// Proceedings of the 14th National Conference on Artificial Intelligence and 9th Innovative Applications of Artificial Intelligence Conference, Providence, Jul 27-31, 1997. Menlo Park: AAAI, 1997: 622-628.
[25] WAN X J, XIAO J G. Towards a unified approach based on affinity graph to various multi-document summarizations[C]// LNCS 4675: Proceedings of the 11th European Conference on Research & Advanced Technology for Digital Libraries, Budapest, Sep 16-21, 2007. Berlin, Heidelberg: Springer, 2007: 297-308.
[26] GIANNAKOPOULOS G, KARKALETSIS V, VOUROS G A. Testing the use of n-gram graphs in summarization sub-tasks[C]// Proceedings of the 2008 Text Analysis Conference, Gaithersburg, Nov 17-19, 2008: 324-334.
[27] MORALES L P, ESTEBAN A D, GERVAS P. Concept graph based biomedical automatic summarization using ontologies[C]// Proceedings of the 1st Text Analysis Conference, Gaithersburg, Nov 17-19, 2008: 53-56.
[28] ERKAN G, RADEV D R. LexPageRank: prestige in multi-document text summarization[C]// Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Jul 25-26, 2004. Stroudsburg: ACL, 2004: 365-371.
[30] MIHALCEA R. Graph-based ranking algorithms for sentence extraction, applied to text summarization[C]// Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Jul 21-26, 2004. Stroudsburg: ACL, 2004: 170-173.
[31] YU L T, ZHANG W N, WANG J, et al. SeqGAN: sequence generative adversarial nets with policy gradient[C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, Feb 4-9, 2017. Menlo Park: AAAI, 2017: 2852-2858.
[32] AL-SABAHI K, ZHANG Z P, NADHER M. A hierarchical structured self-attentive model for extractive document summarization (HSSAS)[J]. IEEE Access, 2018, 6: 24205-24212.
[33] ZHANG X X, WEI F R, ZHOU M. HIBERT: document level pre-training of hierarchical bidirectional transformers for document summarization[J]. arXiv:1905.06566, 2019.
[34] ZHONG M, LIU P F, CHEN Y R, et al. Extractive summarization as text matching[J]. arXiv:2004.08795, 2020.
[35] HERMANN K M, KOCISKÝ T, GREFENSTETTE E, et al. Teaching machines to read and comprehend[C]// Proceedings of the Annual Conference on Neural Information Processing Systems 2015, Montreal, Dec 7-12, 2015. Red Hook: Curran Associates, 2015: 1693-1701.
[36] FABBRI A R, LI I, SHE T, et al. Multi-News: a large-scale multi-document summarization dataset and abstractive hierarchical model[J]. arXiv:1906.01749, 2019.
[37] LIN C Y. ROUGE: a package for automatic evaluation of summaries[C]// Proceedings of the 2004 Workshop on Text Summarization Branches Out, Barcelona, Jul 2004. Stroudsburg: ACL, 2004: 74-81.
[38] MIHALCEA R, TARAU P. TextRank: bringing order into text[C]// Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Jul 25-26, 2004. Stroudsburg: ACL, 2004: 404-411.
[39] CHENG J, LAPATA M. Neural summarization by extracting sentences and words[J]. arXiv:1603.07252, 2016.
[40] SEE A, LIU P J, MANNING C D. Get to the point: summarization with pointer-generator networks[J]. arXiv:1704.04368, 2017.