Journal of Frontiers of Computer Science and Technology

• Science Researches •

SbSER: A Step-by-Step Enhanced Reasoning Framework for Large Language Model With External Subgraph Generation

FENG Tuoyu, WANG Gangliang, QIAO Zijian, LI Weiping, ZHANG Yusong, GUO Qinglang

  1. School of Software & Microelectronics, Peking University, Beijing 100091, China
    2. China Academy of Electronics and Information Technology, Beijing 100041, China
    3. School of Telecommunications Engineering, Xidian University, Xi’an 710126, China

Abstract: Large Language Models (LLMs) have achieved remarkable success across a wide range of tasks since their inception, particularly in machine translation, text generation, and question answering, and their applications have rapidly expanded to more complex tasks. However, despite their strong performance in many areas, LLMs still face significant challenges in tasks that require deep reasoning and logical deduction. This is mainly because LLMs are trained on large volumes of textual data that rarely cover specialized knowledge across all domains comprehensively. As a result, LLMs tend to produce "hallucinations" when handling domain-specific problems, outputting answers that are inaccurate or factually incorrect. This issue can be mitigated by incorporating external knowledge graphs (KGs) to assist the reasoning process of LLMs. This paper presents SbSER, a step-by-step enhanced reasoning framework for LLMs with external subgraph generation. First, it guides the LLM to perform accurate semantic parsing by generating clear subgraph schemas, converting questions into logical retrieval statements. Second, it imports knowledge triples into a graph database to enable precise knowledge retrieval. Finally, it produces the enhanced reasoning answer by combining two reasoning modes: direct retrieval reasoning and joint retrieval reasoning. Experimental results demonstrate that SbSER achieves significant improvements across multiple datasets. Building on these results, this study aims to provide a useful reference for future research on integrating KGs with LLMs, thereby enhancing the ability of LLMs to solve complex problems.
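
To make the three-stage workflow described in the abstract concrete, the following Python sketch shows how such a pipeline could be wired together. It is an illustration under stated assumptions, not the authors' implementation: the function names (generate_subgraph_schema, parse_to_query, answer), the abstract llm callable, the in-memory TripleStore standing in for the graph database, and the (head, relation) pattern standing in for a logical retrieval statement are all hypothetical placeholders for the paper's schema-guided parsing, graph-database retrieval, and direct/joint retrieval reasoning steps.

# Minimal, hypothetical sketch of an SbSER-style pipeline (not the authors' code).
# An LLM client is abstracted as a callable `llm(prompt) -> str`; the graph
# database is replaced by an in-memory triple store for illustration.

from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


class TripleStore:
    """Stand-in for a graph database holding knowledge triples."""

    def __init__(self, triples: List[Triple]):
        self.triples = triples

    def query(self, head: Optional[str] = None, relation: Optional[str] = None) -> List[Triple]:
        # Precise retrieval: match on whichever fields the logical query fixes.
        return [t for t in self.triples
                if (head is None or t[0] == head)
                and (relation is None or t[1] == relation)]


def generate_subgraph_schema(llm: Callable[[str], str], question: str) -> str:
    # Step 1a: ask the LLM for a clear subgraph schema (entity and relation
    # types) relevant to the question, which constrains the later parse.
    return llm(f"List the entity and relation types needed to answer: {question}")


def parse_to_query(llm: Callable[[str], str], question: str, schema: str) -> Tuple[str, str]:
    # Step 1b: schema-guided semantic parsing of the question into a logical
    # retrieval statement; here simplified to a (head, relation) pattern.
    raw = llm(f"Schema: {schema}\nQuestion: {question}\nReturn 'head|relation'.")
    head, relation = raw.split("|", 1)
    return head.strip(), relation.strip()


def answer(llm: Callable[[str], str], store: TripleStore, question: str) -> str:
    schema = generate_subgraph_schema(llm, question)
    head, relation = parse_to_query(llm, question, schema)

    # Step 2: precise knowledge retrieval from the (mock) graph database.
    retrieved = store.query(head=head, relation=relation)

    # Step 3a: direct retrieval reasoning -- answer straight from a retrieved triple.
    if retrieved:
        return retrieved[0][2]

    # Step 3b: joint retrieval reasoning -- combine whatever partial evidence
    # exists about the head entity with the LLM's own knowledge.
    context = "; ".join(f"{h} {r} {t}" for h, r, t in store.query(head=head))
    return llm(f"Context: {context}\nQuestion: {question}\nAnswer concisely.")

In the framework itself, the logical retrieval statements would target an actual graph database loaded with the knowledge triples rather than the toy (head, relation) pattern used above; the sketch only illustrates how schema generation, retrieval, and the two reasoning modes fit together step by step.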

Key words: large language model, subgraph generation, step-by-step reasoning
