Low Resource Summarization Model Based on Latent Structural Semantic Enhancement

doi:10.3778/j.issn.1673-9418.2205064

Abstract

Abstract: Low-resource summarization is currently handled mostly by data augmentation or by pre-training followed by fine-tuning, neither of which makes full use of the latent structural semantic information shared by the source text and the target summary. This paper therefore proposes a low-resource summarization model based on latent structural semantic enhancement, which strengthens the model's use of structured information through graph structure alignment. First, a structural feature representation layer extracts latent structural semantic features from the source text and the predicted summary. Next, a latent structure alignment module performs node alignment and edge alignment on these features, which helps the model capture the structured information they contain and thereby make better use of structured knowledge. Finally, the structured feature alignment distance between the source text and the predicted summary is added to the target loss as a regularization term to assist optimization. In experiments on low-resource datasets from six domains, the model improves the ROUGE-1 score by 0.58 on average over the baseline model. The results show that exploiting latent structured semantic knowledge effectively improves low-resource summary generation.

Key words: low resource, structured, semantic features, graph structure
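The three steps in the abstract (feature extraction, node and edge alignment, alignment distance as a regularization term) can be illustrated with a small sketch. The abstract does not say which distances the alignment module uses, so this example assumes an entropic optimal-transport (Sinkhorn) plan for node alignment and a Gromov-Wasserstein-style discrepancy, evaluated under that same plan, for edge alignment. All function names, the cosine cost, and the weight `lam` are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def cosine_cost(xs, ys):
    """Cost matrix with entries 1 - cos(x_i, y_j)."""
    def unit(v):
        norm = math.sqrt(sum(a * a for a in v)) or 1.0  # guard against zero vectors
        return [a / norm for a in v]

    xs, ys = [unit(x) for x in xs], [unit(y) for y in ys]
    return [[1.0 - sum(a * b for a, b in zip(x, y)) for y in ys] for x in xs]


def sinkhorn_plan(cost, eps=0.1, iters=100):
    """Entropic optimal-transport plan between uniform node weights (assumed)."""
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def node_distance(cost, plan):
    """Node alignment: transport cost of matching source nodes to summary nodes."""
    return sum(p * c for prow, crow in zip(plan, cost) for p, c in zip(prow, crow))


def edge_distance(src, summ, plan):
    """Edge alignment: mismatch between intra-graph edge costs under the node plan.

    Simplification: the plan is reused from node alignment instead of being
    optimized for this objective.
    """
    cs, ct = cosine_cost(src, src), cosine_cost(summ, summ)
    n, m = len(src), len(summ)
    return sum(
        plan[i][j] * plan[k][l] * (cs[i][k] - ct[j][l]) ** 2
        for i in range(n) for j in range(m) for k in range(n) for l in range(m)
    )


def regularized_loss(task_loss, src, summ, lam=0.1):
    """Target loss plus the alignment distance as a regularization term."""
    cost = cosine_cost(src, summ)
    plan = sinkhorn_plan(cost)
    return task_loss + lam * (node_distance(cost, plan) + edge_distance(src, summ, plan))


# Toy node features standing in for the structural feature representation layer.
source_nodes = [[1.0, 0.0], [0.0, 1.0]]
summary_nodes = [[1.0, 0.0], [1.0, 0.0]]
print(regularized_loss(1.0, source_nodes, summary_nodes))
```

In this sketch, a predicted summary whose node features and pairwise structure match the source adds almost nothing to the loss, while a structurally mismatched summary is penalized, which is the role the abstract assigns to the regularization term.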

摘要： 当前低资源摘要生成任务通常采用数据增强或预训练结合微调的方式进行处理，对于源文本与目标摘要之间的潜层结构化语义信息未能充分利用。为此，提出一种基于潜层结构化语义增强的低资源摘要模型，以图结构对齐的方式增强模型对结构化信息的利用。首先，该模型通过结构特征表示层获取源文本与预测摘要的潜层结构化语义特征。然后，将获得的语义特征利用潜层结构对齐模块进行节点对齐和边对齐，这种对齐有助于模型捕捉语义特征中的结构化信息，从而增强模型对结构化知识的利用。最后，利用源文本与预测摘要之间的结构化特征对齐距离作为目标损失的正则项来辅助模型进行优化。在六个领域的低资源数据集上进行实验，ROUGE-1分值相对于基线模型平均提高了0.58。结果表明利用潜层结构化语义知识可以有效提高低资源摘要生成的能力。

关键词: 低资源, 结构化, 语义特征, 图结构

LIU Yu, LIU Xiaoming, LIU Weiguang, YANG Guan, LIU Jie. Low Resource Summarization Model Based on Latent Structural Semantic Enhancement[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(8): 1961-1973.

刘宇, 刘小明, 刘卫光, 杨关, 刘杰. 基于潜层结构化语义增强的低资源摘要模型[J]. 计算机科学与探索, 2023, 17(8): 1961-1973.
