Journal of Frontiers of Computer Science and Technology (计算机科学与探索)

• Academic Research •

A Survey of Entity Relation Extraction Based on Large Language Models

XIA Jianglan (夏江镧), LI Yanling (李艳玲), GE Fengpei (葛凤培)

  1. College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China
  2. Key Laboratory of Infinite-dimensional Hamiltonian System and Algorithm Application (Inner Mongolia Normal University), Ministry of Education, Hohhot 010022, China
  3. Library, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract: Entity relation extraction aims to identify entity pairs and the relations between them in unstructured text, and it underpins many downstream natural language processing tasks. With the development of big data and deep learning technologies, research on entity relation extraction has made significant progress. In recent years, applying large language models to this task has become a new research trend. With automatic feature extraction and strong generalization, large language models can significantly improve task performance. This paper reviews entity relation extraction methods and divides them into two main categories according to the evolution of the methods and models used. First, the definitions of the named entity recognition and relation extraction tasks are introduced. Next, the development of entity relation extraction methods is systematically reviewed, with an in-depth analysis of the advantages and disadvantages of the corresponding models. Building on this, the paper focuses on the unique advantages of large language model-based methods for entity relation extraction. The characteristics of current mainstream datasets are then summarized, along with common evaluation metrics for the task, such as precision, recall, and F1 score. Finally, the challenges in current research are analyzed and future research directions are discussed.

Key words: large language models, entity relation extraction, named entity recognition
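
The abstract names precision, recall, and F1 score as the common evaluation metrics. As a minimal sketch of how these are typically computed for extracted triples, the Python below scores predictions by exact match against gold triples. The exact-match criterion, the `prf1` helper, and the example triples are illustrative assumptions, not details taken from the survey.

```python
from typing import Iterable, Tuple

# Hypothetical triple layout: (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def prf1(gold: Iterable[Triple], pred: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring of triples."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example data, for illustration only.
gold = [("Marie Curie", "born_in", "Warsaw"), ("Marie Curie", "field", "physics")]
pred = [("Marie Curie", "born_in", "Warsaw"), ("Marie Curie", "born_in", "Paris")]
print(prf1(gold, pred))
```

Here one of the two predicted triples is correct and one of the two gold triples is recovered, so all three scores are 0.5. Benchmarks may instead use relaxed matching (for example, partial entity spans), which would change the counts.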