Poetry Generation Algorithm with Automatic Expansion of Keyword Semantic Information

doi:10.3778/j.issn.1673-9418.2109075

Abstract

Abstract: Most current poetry generation models take user-supplied keywords as input and generate poems that conform to rhyme and tonal-pattern rules. Because keywords carry little semantic information, the quality of the generated poems is hard to guarantee, and the theme tends to drift across lines. To address this problem, this paper proposes a generative model based on the conditional variational autoencoder that, guided by richer semantic information, generates poems that better match the keywords and better satisfy users. The model samples human-written poems to introduce additional semantic information related to the keywords, which lets it estimate the prior distribution of the conditional variational autoencoder effectively and obtain a prior closer to the true distribution. Because the model automatically expands the keyword information, it narrows the gap between the semantic information of the input and that of the output, and it alleviates the over-translation problem common in previous models. Experimental results show that the proposed model outperforms other models in both automatic and human evaluation, reduces the frequency of over-translation, and improves the fluency of the generated poems. Varying the sampling range makes it possible to control the writing style of the generated poems, which further demonstrates the effectiveness of the algorithm.

Key words: natural language processing, natural language generation, poetry generation, conditional variational autoencoder
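
As a rough illustration of the role the prior plays in a conditional variational autoencoder, the sketch below computes the KL divergence between a recognition posterior and a prior, which is the regularization term in the standard CVAE training objective. It assumes both distributions are diagonal Gaussians, which the abstract does not state. The function names and all numeric values are hypothetical, and this is not the authors' implementation.

```python
import math
import random


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) for two diagonal Gaussians given means and log-variances."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        # Closed-form KL for one dimension, summed over all dimensions.
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def sample_latent(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


# Hypothetical 2-dimensional latent space (assumed values, for illustration only).
posterior_mu, posterior_logvar = [0.3, -0.1], [0.0, 0.0]
prior_mu, prior_logvar = [0.0, 0.0], [0.0, 0.0]

kl = kl_diag_gaussians(posterior_mu, posterior_logvar, prior_mu, prior_logvar)
z = sample_latent(prior_mu, prior_logvar, random.Random(0))
```

The closer the estimated prior is to the posterior inferred from real poems, the smaller this KL term, which is one way to read the abstract's claim that a better prior narrows the gap between input and output semantics.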

WANG Yongchao, ZHOU Lingzhi, ZHAO Yaping, XU Duanqing. Poetry Generation Algorithm with Automatic Expansion of Keyword Semantic Information[J]. Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2023, 17(6): 1387-1394.
