Journal of Frontiers of Computer Science and Technology (计算机科学与探索)

• Academic Research •

A review of false information detection frameworks based on large language models

ZHANG Xin (张欣), SUN Jingchao (孙靖超)

  1. School of Investigation, People's Public Security University of China, Beijing 100038
  2. School of National Security, People's Public Security University of China, Beijing 100038

Abstract: The spread of false information on the Internet, and on social media in particular, has become a pressing problem worldwide. With the rise of artificial intelligence, applying large language models (LLMs) to false information detection has become an active research topic. In China, however, work in this area remains scarce and has not yet formed a complete body of research. To systematically trace the current state and development of the field, this paper comprehensively summarizes research on LLM-enabled false information detection; it is the first review in China on this topic. The paper focuses on LLM-based false information detection frameworks and examines innovative uses of LLMs across the detection process: data generation, data augmentation, information extraction, integration of external knowledge and tools, model improvement, final fusion decision-making, and explanation and feedback generation. It outlines the definition of false information and the background of its spread, analyzes the core detection process within the framework in detail, identifies the innovations at each stage, summarizes the "internal" and "external" detection workflows, and describes the model improvements involved, including retrieval augmentation, prompt engineering, and fine-tuning, together with the final decision step. Finally, it analyzes the challenges currently facing LLM-based false information detection and discusses future research directions, with the aim of providing reference and insight for the field's development.

Key words: Large Language Models, False Information Detection, Data Augmentation, Key Information Extraction