Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (5): 1139-1146. DOI: 10.3778/j.issn.1673-9418.2106063

• Artificial Intelligence · Pattern Recognition •

Model Robustness Optimization Method Using GAN and Feature Pyramid

SUN Jiaze+, TANG Yanmei, WANG Shuyan   

  1. School of Computer Science & Technology, Xi’an University of Posts & Telecommunications, Xi’an 710121, China
  • Online: 2023-05-01 Published: 2023-05-01

Abstract: In adversarial settings, deep neural networks (DNNs) are markedly vulnerable to adversarial samples. To improve model robustness in such settings, a robustness optimization method for deep neural network models, AdvRob, is proposed. First, the target model is restructured into a feature pyramid; the prior knowledge carried by its latent features is then used to generate stronger adversarial samples for adversarial training. Experiments on the MNIST and CIFAR-10 datasets show that adversarial samples generated from latent features achieve a higher attack success rate, greater diversity, and stronger transferability than those generated by AdvGAN. Under large perturbations on MNIST, the AdvRob model's defense against FGSM and JSMA attacks improves by a factor of at least 4 over the original model, and its defense against PGD, BIM, and C&W attacks improves by a factor of at least 10. On CIFAR-10, the AdvRob model's defense against FGSM, PGD, C&W, BIM, and JSMA attacks improves by a factor of at least 5 over the original model. On the SVHN dataset, AdvRob provides the most significant defense against white-box attacks when compared with FGSM adversarial training, PGD adversarial training, defensive distillation, and robustness optimization methods that add external modules. AdvRob thus offers an efficient robustness optimization method for DNN models in adversarial settings.

Key words: generative adversarial network (GAN), deep neural networks, adversarial sample, feature pyramid, model robustness
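
For readers unfamiliar with the attacks named in the abstract, the following is a minimal sketch of FGSM and of an attack success rate, written for a toy logistic-regression classifier. It is not the AdvRob method and not the authors' code: the model, the weights `W` and `B`, the data in `SAMPLES`, and all function names are hypothetical and chosen only for illustration. The sketch assumes inputs scaled to [0, 1] and a cross-entropy loss, for which the input gradient has the closed form (p - y) * w.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_prob(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """One FGSM step: x_adv = clip(x + eps * sign(dLoss/dx), 0, 1).

    For logistic regression with cross-entropy loss, dLoss/dx = (p - y) * w.
    """
    p = predict_prob(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, sign)]

def attack_success_rate(w, b, samples, eps):
    """Fraction of correctly classified samples whose label flips after FGSM."""
    flipped = total = 0
    for x, y in samples:
        if (predict_prob(w, b, x) >= 0.5) != bool(y):
            continue  # skip samples the model already gets wrong
        total += 1
        x_adv = fgsm(w, b, x, y, eps)
        if (predict_prob(w, b, x_adv) >= 0.5) != bool(y):
            flipped += 1
    return flipped / total if total else 0.0

# Hypothetical toy model and data, for illustration only.
W, B = [2.0, -2.0], 0.0
SAMPLES = [([0.6, 0.4], 1), ([0.4, 0.6], 0), ([0.9, 0.1], 1)]

if __name__ == "__main__":
    for eps in (0.05, 0.2, 0.5):
        print(eps, attack_success_rate(W, B, SAMPLES, eps))
```

In this toy setting the success rate rises as the perturbation budget `eps` grows, which is why robustness results are usually reported per perturbation level. Adversarial training, in general terms, adds samples such as `x_adv` with their correct labels back into the training set.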