Journal of Frontiers of Computer Science and Technology (计算机科学与探索)

• Academic Research •

Hate Speech Detection Method Integrating Prefix Tuning and Prompt Learning

XU Lei (徐磊), HU Yahao (胡亚豪), CHEN Man (陈满), CHEN Jun (陈军), PAN Zhisong (潘志松)

  1. Command and Control Engineering College, Army Engineering University of PLA, Nanjing 210000, China

Abstract: As online social platforms grow, the harm caused by cyberbullying has become increasingly prominent. Hate speech is one form of cyberbullying, and research on detecting it helps build a healthy internet environment. Earlier hate speech detection approaches relied mainly on manual moderation and keyword filtering, which incur high labor costs and classify poorly. Meanwhile, under platform moderation, hate speech has become more implicit and covert, which places higher demands on a detection model's text comprehension. Large models such as ChatGPT open up new possibilities for this task. Inspired by their strong performance on a range of downstream tasks, this paper proposes P-Prompt, a hate speech detection method that integrates prefix tuning and prompt learning. Specifically, the method fine-tunes a large model on relevant datasets with prefix tuning and combines this with prompt learning so that the model recognizes and attends to hate speech vocabulary in the text under inspection, further improving its ability to identify hate speech. Experimental results confirm that large models are effective for hate speech detection and show that, compared with traditional methods, P-Prompt achieves clear improvements on all evaluation metrics in both the binary and the multi-class classification tasks.

Keywords: hate speech detection, large language models, prefix tuning, prompt learning
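
The abstract does not give the prompt format used by P-Prompt, so the following is only an illustrative sketch of the prompt-learning idea it describes: surfacing suspected hate speech vocabulary in the prompt so that the model can attend to it. The lexicon, the template wording, and the function name `build_prompt` are all hypothetical and are not taken from the paper. The prefix-tuning stage (training a small set of prefix parameters while the backbone model stays frozen) is not shown.

```python
import re

# Hypothetical mini-lexicon for illustration only; a real system would
# load a curated hate speech vocabulary instead.
LEXICON = {"vermin", "scum"}


def build_prompt(text, lexicon=LEXICON, labels=("hate", "not hate")):
    """Build a classification prompt that lists lexicon terms found in `text`.

    Returns the prompt string and the sorted list of flagged terms.
    The template is an assumption, not the one used by P-Prompt.
    """
    # Lowercase and split into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Keep each lexicon hit once, in a deterministic order.
    flagged = sorted({t for t in tokens if t in lexicon})
    hint = ", ".join(flagged) if flagged else "none"
    prompt = (
        f"Text: {text}\n"
        f"Potentially hateful terms: {hint}\n"
        f"Question: Is this text {labels[0]} or {labels[1]}?\n"
        "Answer:"
    )
    return prompt, flagged
```

In this sketch the resulting prompt would be passed to the prefix-tuned model, and the answer token would be mapped to a class label. For the multi-class task, `labels` and the question line would need to be extended to cover every category.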