Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (5): 1038-1048. DOI: 10.3778/j.issn.1673-9418.2210061

• Frontiers·Surveys •

Survey on Backdoor Attacks and Countermeasures in Deep Neural Network

QIAN Hanwei, SUN Weisong   

  1. Department of Computer Information and Cybersecurity, Jiangsu Police Institute, Nanjing 210031, China
  2. Software Institute, Nanjing University, Nanjing 210093, China
  • Online: 2023-05-01 Published: 2023-05-01

深度神经网络中的后门攻击与防御技术综述

钱汉伟,孙伟松   

  1. 江苏警官学院 计算机信息与网络安全系,南京 210031
  2. 南京大学 软件学院,南京 210093

Abstract: A neural network backdoor attack implants a hidden backdoor into a deep neural network so that the infected model behaves normally on benign test samples but abnormally on poisoned test samples that carry the backdoor trigger. For example, the infected model may predict the attacker's target label for every poisoned test sample. This paper presents a comprehensive review and taxonomy of existing attack methods, classified by attack object into four types: data poisoning attacks, physical world attacks, model poisoning attacks, and others. It summarizes existing backdoor defense techniques from the perspective of attack-defense confrontation, grouping them into poisoned sample identification, poisoned model identification, poisoned test sample filtering, and others. It explains the origins of backdoor defects in deep neural networks from the perspectives of the mathematical principles of deep learning and visualization, and discusses the difficulties and future directions of backdoor attacks and countermeasures from the perspectives of software engineering and program analysis. This survey is intended to help researchers understand research progress on deep neural network backdoor attacks and countermeasures, and to offer inspiration for designing more robust deep neural networks.

Key words: deep neural network, backdoor attack, backdoor countermeasures, trigger
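
The data poisoning category named in the abstract can be illustrated with a minimal sketch. It assumes a simple patch-style trigger (a small bright square stamped in the image corner) and relabeling of the poisoned samples to the attacker's target label. The function names `stamp_trigger` and `poison_dataset`, the patch size, and the poison rate are hypothetical choices made for illustration; they are not taken from the survey.

```python
import numpy as np


def stamp_trigger(image, patch_size=3, value=1.0):
    """Return a copy of `image` with a square patch in the bottom-right corner.

    The patch plays the role of the backdoor trigger (an illustrative
    assumption; real triggers vary widely in form).
    """
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = value
    return poisoned


def poison_dataset(images, labels, target_label, poison_rate=0.1, seed=0):
    """Stamp the trigger on a random subset and relabel it as `target_label`.

    Returns the poisoned images, the poisoned labels, and the indices of
    the samples that were modified. The inputs are left untouched.
    """
    rng = np.random.default_rng(seed)
    n_samples = len(images)
    n_poison = int(round(poison_rate * n_samples))
    idx = rng.choice(n_samples, size=n_poison, replace=False)

    poisoned_images = images.copy()
    poisoned_labels = labels.copy()
    for i in idx:
        poisoned_images[i] = stamp_trigger(poisoned_images[i])
        poisoned_labels[i] = target_label
    return poisoned_images, poisoned_labels, idx


if __name__ == "__main__":
    # Toy data: 100 blank 28x28 grayscale images with labels 0..9.
    images = np.zeros((100, 28, 28), dtype=np.float32)
    labels = np.arange(100) % 10
    x_p, y_p, idx = poison_dataset(images, labels, target_label=7)
    print(len(idx), "samples poisoned")
```

A model trained on such a set would be expected to learn the association between the patch and the target label while keeping its accuracy on clean inputs, which is the behavior the abstract describes. The sketch covers only the data preparation step, not training or defense.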

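Abstract: Neural network backdoor attacks aim to implant a hidden backdoor into a deep neural network, so that the attacked model performs normally on benign test samples but abnormally on poisoned test samples carrying the backdoor trigger, for example by predicting the attacker's target class for poisoned test samples. This paper comprehensively reviews existing attack and defense methods. Taking the attack object as the main classification criterion, it divides attack methods into data poisoning attacks, physical world attacks, poisoned model attacks, and other attacks. From the perspective of attack-defense confrontation, it summarizes existing backdoor attack and defense techniques and divides defense methods into identifying poisoned data, identifying poisoned models, filtering attack data, and other categories. It explores the causes of backdoor defects in deep neural networks from perspectives such as the geometric principles of deep learning and visualization, and discusses the difficulties and future directions of backdoor attack and defense from perspectives such as software engineering and program analysis. It is hoped that this will help researchers understand research progress on backdoor attack and defense in deep neural networks and provide more inspiration for designing more robust deep neural networks.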

Key words: deep neural network, backdoor attack, backdoor defense, trigger