Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2024, Vol. 18 ›› Issue (9): 2361-2369. DOI: 10.3778/j.issn.1673-9418.2406067

• Special Topic: Construction and Application of Vertical-Domain Large Models •

Research on Knowledge Injection Method for Large Language Model Oriented to Process Specification Texts

JI Guiyang (纪贵阳), WANG Peiyan (王裴岩), YU Zhuo (余卓)

  1. School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China
  2. Aviation Manufacturing Technology Research Institute, COMAC Shanghai Aircraft Manufacturing Co., Ltd., Shanghai 201324, China
  • Online: 2024-09-01  Published: 2024-09-01

Abstract: Applying large language models to process specifications is an effective way to address inaccurate process knowledge queries. Existing methods for building domain models, whether by embedding domain knowledge graphs or by fine-tuning on instruction data, perform poorly. The difficulty is twofold. First, the process knowledge in process specifications involves relationships among multiple process elements and is therefore highly complex. Second, the specifications are connected to one another only through citations, which makes the data sparse. Together, the high complexity of process knowledge and the sparsity of the data limit the model’s ability to learn process domain concepts, relationships between concepts and attributes, relationships between concepts, relationships among multiple concepts, and reference knowledge. To address this difficulty, this paper proposes a large language model knowledge injection method for process specification texts. Based on the characteristics of process specification data, knowledge injection data are designed around four tasks: auxiliary sentence identification, concept-chapter generation, chapter continuation, and chapter-summary generation. These data are combined with question-answer pair data for supervised fine-tuning, injecting domain concepts, attributes, relationships among multiple concepts, and reference knowledge into the model. Experimental results show that, compared with a model trained only on question-answer pair data, the model trained on both knowledge injection data and question-answer pair data improves ACC (accuracy) by 7.3 percentage points, ROUGE-L by 7.4 percentage points, and BLEU-4 by 6.2 percentage points, which indicates the effectiveness of the proposed knowledge injection method.

Keywords: process specification, large language model, knowledge injection, supervised fine-tuning
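For readers unfamiliar with ROUGE-L, one of the metrics reported in the abstract, the following is a minimal sketch of the standard sentence-level definition based on the longest common subsequence (LCS). It is not the authors' evaluation code. The function names, the character-level tokenization in the usage lines, and the default `beta = 1.0` are assumptions made for illustration, since the abstract does not state the evaluation settings.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: Sequence[str], reference: Sequence[str],
            beta: float = 1.0) -> float:
    """Sentence-level ROUGE-L F-measure.

    beta weights recall against precision; beta = 1.0 is an assumption
    here, not a setting taken from the paper.
    """
    if not candidate or not reference:
        return 0.0
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Hypothetical usage with character-level tokens (an assumption).
reference = list("abcde")
candidate = list("abde")
score = rouge_l(candidate, reference)
```

In this toy example the LCS is `a, b, d, e`, so precision is 4/4 and recall is 4/5. A reported gain of 7.4 percentage points corresponds to a difference of 0.074 between two such scores on a 0 to 1 scale.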