Overview of Knowledge Graph Question Answering Enhanced by Large Language Models

doi:10.3778/j.issn.1673-9418.2407069

Abstract

Knowledge graph question answering (KGQA) retrieves answers from a knowledge graph by processing natural language questions posed by users. Early KGQA techniques were limited by the size of knowledge graphs, computational power, and natural language processing capabilities, which resulted in low accuracy. In recent years, advances in artificial intelligence, particularly the development of large language models (LLMs), have significantly improved KGQA performance, and LLMs such as GPT-3 have been widely applied to the task. To support further study of LLM-enhanced KGQA, this paper surveys the methods that use LLMs for KGQA. First, background on LLMs and KGQA is summarized, including the technical principles and training methods of LLMs and the basic concepts of knowledge graphs, question answering, and KGQA. Second, existing methods for enhancing KGQA with LLMs are reviewed along two dimensions, semantic parsing and information retrieval, and the problems these methods address and their limitations are analyzed. Third, related resources and evaluation methods for LLM-enhanced KGQA are collected and organized, and the performance of existing methods is summarized. Finally, the limitations of current methods are analyzed, and future research directions are proposed.

Key words: large language model, knowledge graph question answering, semantic parsing, information retrieval

FENG Tuoyu, LI Weiping, GUO Qinglang, WANG Gangliang, ZHANG Yusong, QIAO Zijian. Overview of Knowledge Graph Question Answering Enhanced by Large Language Models[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(11): 2887-2900.
