A GAN-Based Method for Generating Image Adversarial Samples

doi:10.3778/j.issn.1673-9418.2005022

Abstract:

To improve the diversity and attack success rate of adversarial samples produced by generative adversarial network (GAN) models, a GAN-based method for generating image adversarial samples is proposed. First, the whole original sample set is used to train a deep convolutional generative adversarial network, G1, which models the distribution of the original samples. Second, in the black-box attack scenario, model distillation is used to replicate the target model and obtain a local copy of it. Then a second generative adversarial network, G2, is trained with the output of G1 as its input and the distilled model as its target model; for targeted attacks the target class is supplied as an additional input, and G2 generates a perturbation of the input data toward that class. Finally, the sample and the perturbation are added and the result is normalized to the pixel gray-value range to obtain the adversarial sample. Experimental results show that, under the same input conditions, the average SSIM index, MI index, and cosine similarity of the adversarial image samples generated by this method decrease by 50.7%, 10.96%, and 28.7%, respectively, while the average mean squared error (MSE) and the Hamming distance between image fingerprints increase by 7.6% and 1974.80, respectively. The average attack success rate on models for the MNIST and CIFAR10 datasets is above 95%.

Key words: neural networks, adversarial sample, generative adversarial network (GAN), model distillation, image diversity
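
The last step of the method (adding the perturbation to the sample and normalizing to the pixel range) and three of the diversity measures named in the abstract (MSE, cosine similarity, and the Hamming distance between image fingerprints) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: pixels are taken to lie in [0, 1], the normalization is done by clipping, the perturbation is random noise standing in for the output of G2, and the fingerprint is a simple average hash, since the abstract does not say which fingerprint is used. All function names are hypothetical.

```python
import numpy as np


def make_adversarial(sample, perturbation, low=0.0, high=1.0):
    """Add a perturbation to a sample and clip to the valid pixel range.

    Assumption: the range is [low, high] = [0, 1] and clipping is used.
    """
    return np.clip(sample + perturbation, low, high)


def mse(a, b):
    """Mean squared error between two images of equal shape."""
    return float(np.mean((a - b) ** 2))


def cosine_similarity(a, b):
    """Cosine similarity between two images viewed as flat vectors."""
    a, b = a.ravel(), b.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def average_hash(image, hash_size=8):
    """Illustrative fingerprint (assumption: average hash).

    Block-averages a 2-D grayscale image down to hash_size x hash_size and
    thresholds each cell at the mean. Both image sides must be multiples
    of hash_size.
    """
    h, w = image.shape
    blocks = image.reshape(hash_size, h // hash_size, hash_size, w // hash_size)
    small = blocks.mean(axis=(1, 3))
    return (small > small.mean()).ravel()


def hamming_distance(fp_a, fp_b):
    """Number of positions at which two boolean fingerprints differ."""
    return int(np.count_nonzero(fp_a != fp_b))


# Usage: a random 32x32 grayscale "image" and a random stand-in perturbation.
rng = np.random.default_rng(0)
x = rng.random((32, 32))
delta = rng.uniform(-0.3, 0.3, size=x.shape)  # placeholder for G2's output
x_adv = make_adversarial(x, delta)

print("MSE:", mse(x, x_adv))
print("cosine similarity:", cosine_similarity(x, x_adv))
print("fingerprint Hamming distance:",
      hamming_distance(average_hash(x), average_hash(x_adv)))
```

Under these measures, a lower cosine similarity and a higher MSE or Hamming distance indicate that the generated sample differs more from its source, which is the sense in which the abstract reports increased diversity.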

WANG Shuyan, JIN Hang, SUN Jiaze. Method for Image Adversarial Samples Generating Based on GAN[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(4): 702-711.
