Hate Speech Detection Method Integrating Prefix Tuning and Prompt Learning

doi:10.3778/j.issn.1673-9418.2405026

Abstract

With the growth of online social platforms, the harm caused by cyber violence has become increasingly prominent. Hate speech is one form of cyber violence, and research on detecting it helps build a healthy internet environment. Existing approaches face two problems. First, earlier hate speech detection relied mainly on manual moderation and keyword filtering, which carry high labor costs and classify poorly. Second, under platform regulation, hate speech has become more implicit and covert, which places higher demands on the text comprehension ability of detection models. Large models, represented by ChatGPT, open up new possibilities for this task. Inspired by their strong performance on a range of downstream tasks, this paper proposes P-Prompt, a hate speech detection method that integrates prefix tuning and prompt learning. Specifically, the method first uses prefix tuning to fine-tune a large model on relevant datasets, and combines this with prompt learning so that the model can recognize and attend to hate speech vocabulary in the text under test, further improving its ability to identify hate speech. Experimental results confirm the effectiveness of large models for hate speech detection and show that, compared with traditional methods, P-Prompt achieves clear improvements on every evaluation metric in both the binary and multi-class classification tasks.

Key words: hate speech detection, large language models, prefix tuning, prompt learning
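The abstract does not give the prompt template that P-Prompt uses, so the following is only a minimal sketch of the prompt-learning idea it describes: surfacing suspected hate speech vocabulary to the model alongside the text to be classified. The lexicon entries, the template wording, the label names, and the functions `find_flagged_terms` and `build_prompt` are all hypothetical, and the prefix-tuning step is not shown.

```python
import re

# Hypothetical lexicon; placeholder tokens stand in for real terms.
HATE_LEXICON = {"slur_a", "slur_b"}


def find_flagged_terms(text, lexicon):
    """Return lexicon terms found in `text`, in order of first appearance."""
    tokens = re.findall(r"\w+", text.lower())
    flagged = []
    for token in tokens:
        if token in lexicon and token not in flagged:
            flagged.append(token)
    return flagged


def build_prompt(text, lexicon, labels=("hate", "non-hate")):
    """Build a classification prompt that points the model at flagged terms.

    The template is an assumption for illustration, not the paper's template.
    """
    flagged = find_flagged_terms(text, lexicon)
    hint = ", ".join(flagged) if flagged else "none"
    return (
        f"Text: {text}\n"
        f"Potentially hateful terms: {hint}\n"
        f"Classify the text as one of: {', '.join(labels)}.\n"
        "Answer:"
    )


if __name__ == "__main__":
    print(build_prompt("You are a slur_a and a SLUR_B", HATE_LEXICON))
```

In a setup like the one the abstract outlines, a prompt of this kind would be passed to the prefix-tuned model. For the multi-class task, the `labels` argument could carry the dataset's category names instead of the binary pair.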

XU Lei, HU Yahao, CHEN Man, CHEN Jun, PAN Zhisong. Hate Speech Detection Method Integrating Prefix Tuning and Prompt Learning[J]. Journal of Frontiers of Computer Science and Technology, DOI: 10.3778/j.issn.1673-9418.2405026.

