Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (10): 2389-2402. DOI: 10.3778/j.issn.1673-9418.2307061

• Special Issue on Large Language Models and Knowledge Graphs •

Named Entity Recognition Method of Large Language Model for Medical Question Answering System

YANG Bo, SUN Xiaohu, DANG Jiayi, ZHAO Haiyan, JIN Zhi   

  1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  2. Engineering Research Center for Forestry-Oriented Intelligent Information Processing, National Forestry and Grassland Administration, Beijing 100083, China
  3. School of Computer Science, Peking University, Beijing 100871, China
  4. Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871, China
  • Online: 2023-10-01 Published: 2023-10-01

Abstract: Entity recognition plays a major role in medical question answering systems, and deep-learning-based entity recognition has attracted growing attention. However, because annotated training data are scarce in this setting, deep learning methods struggle to identify discontinuous and nested entities in medical text. To address this, a large-language-model-based entity recognition method is proposed and applied to a medical question answering system. First, datasets related to medical question answering are converted into text that a large language model can analyze and process. Second, the outputs of the large language model are classified, and each class is handled accordingly. Third, intent recognition is performed on the input text. Finally, the entity recognition and intent recognition results are sent to a medical knowledge graph as a query to obtain the answer to the medical question. Experiments on three typical datasets, with comparisons against several representative related methods, show that the proposed method performs better.

Key words: large language models, entity recognition, intent recognition, medical question answering system
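
The four stages described in the abstract (prompt construction, classification of the model output, intent recognition, and knowledge graph lookup) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the output categories (`json_list`, `free_text`, `empty`), the keyword-based intent rules, the toy knowledge graph, and the stand-in `fake_llm` function are all hypothetical and do not come from the paper.

```python
import json
import re
from typing import Callable, List, Optional, Tuple

# Toy knowledge graph: (entity, relation) -> answers. Purely illustrative data.
TOY_KG = {
    ("influenza", "symptom"): ["fever", "cough"],
    ("influenza", "treatment"): ["rest", "fluids"],
}

# Hypothetical keyword rules standing in for an intent-recognition model.
INTENT_KEYWORDS = {
    "symptom": ("symptom", "sign"),
    "treatment": ("treat", "cure", "therapy"),
}


def build_prompt(question: str) -> str:
    """Stage 1: turn a raw question into text an LLM can process."""
    return (
        "Extract all medical entities from the question below. "
        "Answer with a JSON list of strings.\n"
        f"Question: {question.strip()}"
    )


def classify_output(raw: str) -> Tuple[str, List[str]]:
    """Stage 2: sort the LLM reply into a category and normalize it."""
    text = raw.strip()
    if not text:
        return "empty", []
    try:
        parsed = json.loads(text)
        if isinstance(parsed, list):
            return "json_list", [str(item).lower() for item in parsed]
    except json.JSONDecodeError:
        pass
    # Fallback: treat the reply as free text split on commas or semicolons.
    parts = [p.strip().lower() for p in re.split(r"[,;]", text) if p.strip()]
    return "free_text", parts


def recognize_intent(question: str) -> Optional[str]:
    """Stage 3: map the question to a relation name, or None if unknown."""
    lowered = question.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return None


def answer(question: str, llm: Callable[[str], str]) -> List[str]:
    """Stage 4: combine entities and intent into a knowledge graph lookup."""
    _, entities = classify_output(llm(build_prompt(question)))
    intent = recognize_intent(question)
    results: List[str] = []
    for entity in entities:
        results.extend(TOY_KG.get((entity, intent), []))
    return results


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns one entity."""
    return '["Influenza"]'


if __name__ == "__main__":
    print(answer("What are the symptoms of influenza?", fake_llm))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM service; a real system would replace `fake_llm` with an actual model call and `TOY_KG` with queries against a medical knowledge graph.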