Journal of Frontiers of Computer Science and Technology ›› 2025, Vol. 19 ›› Issue (1): 107-117. DOI: 10.3778/j.issn.1673-9418.2406009

• Construction and Application of Large Models •

Design of University Research Management Question Answering System Integrating Knowledge Graph and Large Language Models

WANG Yong (王永), QIN Jiajun (秦嘉俊), HUANG Yourui (黄有锐), DENG Jiangzhou (邓江洲)

  1. Key Laboratory of Electronic Commerce and Logistics, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Published: 2025-01-01 Online: 2024-12-31

Abstract: Research management is an important part of university administration, but existing research management systems struggle to meet users' individual needs. Driven by the need to make university research management more intelligent, this paper combines a knowledge graph, traditional models, and large language models to build a new-generation question answering system for university research management. First, research knowledge is collected to construct a research knowledge graph. Next, a multi-task model that performs intent classification and entity extraction simultaneously is used for semantic parsing. The parsing results are then used to generate query statements, which retrieve information from the knowledge graph to answer routine questions. In addition, large language models are combined with the knowledge graph to help handle open-ended questions. Experiments on datasets in which intents and entities are correlated show that the multi-task model achieves F1 scores of 0.958 on intent classification and 0.937 on entity recognition, outperforming the other comparison models and the single-task models. A Cypher generation test shows that the custom prompt is effective at eliciting the emergent abilities of large language models: text-to-Cypher generation with large language models reaches an accuracy of 85.8%, which effectively handles open-ended questions over the knowledge graph. The question answering system built from the knowledge graph, traditional models, and large language models achieves an accuracy of 0.935, meeting the needs of intelligent question answering.

Key words: knowledge graph, multi-task model, intent classification, named entity recognition, large language models
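
The two answer paths described in the abstract (query generation from the parsed intent and entities for routine questions, and LLM-based text-to-Cypher for open questions) can be illustrated with a minimal sketch. This is not the authors' implementation: the intent names, node labels, relationship types, schema text, and prompt wording below are all hypothetical, and the multi-task parser and the LLM call are left out. The sketch only shows how a parse result might be routed to a parameterized Cypher template, with a fallback prompt when no template applies.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Parse:
    """Output of semantic parsing: one intent plus the extracted entities."""
    intent: str
    entities: dict[str, str] = field(default_factory=dict)


# Hypothetical intent -> parameterized Cypher templates. The labels
# (Researcher, Paper, Project, Department) are assumptions for illustration.
TEMPLATES = {
    "papers_by_author": (
        "MATCH (r:Researcher {name: $name})-[:AUTHORED]->(p:Paper) "
        "RETURN p.title AS title"
    ),
    "projects_by_department": (
        "MATCH (d:Department {name: $department})<-[:BELONGS_TO]-(pr:Project) "
        "RETURN pr.title AS title"
    ),
}

# Entities each template needs before it can be filled in.
REQUIRED = {
    "papers_by_author": ("name",),
    "projects_by_department": ("department",),
}

# Hypothetical schema description handed to the LLM for open questions.
SCHEMA_HINT = (
    "(:Researcher {name})-[:AUTHORED]->(:Paper {title, year})\n"
    "(:Project {title})-[:BELONGS_TO]->(:Department {name})"
)


def build_query(parse: Parse) -> tuple[str, dict[str, str]] | None:
    """Return (cypher, params) for a known intent, or None if unusable."""
    template = TEMPLATES.get(parse.intent)
    if template is None:
        return None
    needed = REQUIRED[parse.intent]
    if any(key not in parse.entities for key in needed):
        return None
    # Pass values as query parameters rather than splicing them into the text.
    return template, {key: parse.entities[key] for key in needed}


def build_cypher_prompt(question: str, schema: str = SCHEMA_HINT) -> str:
    """Assemble a text-to-Cypher prompt (wording is an assumption)."""
    return (
        "You translate questions into Cypher for the graph schema below.\n"
        f"Schema:\n{schema}\n"
        "Return only one Cypher query.\n"
        f"Question: {question}"
    )


def route(parse: Parse, question: str) -> tuple[str, object]:
    """Use a template for routine questions, otherwise fall back to the LLM."""
    query = build_query(parse)
    if query is not None:
        return "template", query
    return "llm", build_cypher_prompt(question)
```

For a parse such as `Parse("papers_by_author", {"name": "WANG Yong"})`, `route` returns the template path together with the query text and its parameter dictionary; an unrecognized intent or a missing entity yields the `"llm"` path and a prompt string that would then be sent to whichever model is in use. Keeping entity values in parameters rather than in the query string is a design choice of this sketch, made to avoid malformed or injected Cypher.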