Adversarial Example Remaining Availability and Functionality

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (10): 2286-2297. DOI: 10.3778/j.issn.1673-9418.2103057

XIAO Mao¹,², GUO Chun¹,²,⁺, SHEN Guowei¹,², JIANG Chaohui¹,²

1. School of Computer Science and Technology, Guizhou University, Guiyang 550025, China
2. State Key Laboratory of Public Big Data, Guiyang 550025, China

Received: 2021-03-17 Revised: 2021-05-14 Published: 2022-10-01 Released online: 2021-05-18
Corresponding author: ⁺ E-mail: gc_gzedu@163.com
About the authors: XIAO Mao, born in 1996, M.S. candidate, member of CCF. His research interests include network security and malware detection.
GUO Chun, born in 1986, Ph.D., associate professor, M.S. supervisor, member of CCF. His research interests include data mining, intrusion detection, and malware detection.
SHEN Guowei, born in 1986, Ph.D., associate professor, M.S. supervisor, member of CCF. His research interests include network and information security and big data.
JIANG Chaohui, born in 1965, professor, M.S. supervisor. His research interests include network and information security and intrusion detection.
Supported by:
National Natural Science Foundation of China (62062022); Science and Technology Foundation of Guizhou Province ([2020]1Y268)

Abstract:

Malware detection methods based on gray images have received considerable attention because they require no disassembly and achieve high detection accuracy. Several adversarial attacks against this type of detection method have been proposed, but most of them either cannot ensure that the generated adversarial examples retain the availability or functionality of the original PE file, or append bytecode at the end of the PE file, where it can be accurately detected through the file header information. Based on an analysis of the section alignment mechanism and the file alignment mechanism of PE files, this paper proposes a bytecode attack method that retains the availability and functionality of PE files (BARAF). BARAF modifies or adds bytecode in bulk within the gap spaces produced by the file alignment mechanism and the expansion spaces that arise from the section alignment mechanism, generating adversarial examples that retain availability and functionality in order to deceive malware detection methods based on gray images. Experimental results show that the adversarial examples generated by BARAF reduce the accuracy of a gray-image-based malware detection method by up to 31.58 percentage points, and that they are difficult to detect accurately through the file header information.

Key words: adversarial example, malware detection, gray image, PE file
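
The two kinds of space named in the abstract can be illustrated with a little alignment arithmetic. The following is a minimal sketch, not the paper's implementation. It assumes that the "gap space" of a section is the padding between the end of its real data (`VirtualSize`) and its file-aligned size on disk (`SizeOfRawData`), and that the "expansion space" is the room a section could still grow into on disk before reaching the next multiple of `SectionAlignment`. The function names and the example values (0x200, 0x1000, 0x1234) are hypothetical.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def gap_space(virtual_size: int, raw_size: int) -> int:
    """Padding bytes at the end of a section on disk (assumed definition)."""
    return max(0, raw_size - virtual_size)


def expansion_space(virtual_size: int, raw_size: int, section_alignment: int) -> int:
    """Bytes the section could grow on disk without crossing the next
    section-aligned boundary in memory (assumed definition)."""
    return max(0, align_up(virtual_size, section_alignment) - raw_size)


# Hypothetical header values for a single section.
FILE_ALIGNMENT = 0x200      # typical on-disk alignment
SECTION_ALIGNMENT = 0x1000  # typical in-memory alignment
virtual_size = 0x1234       # bytes of real section data

raw_size = align_up(virtual_size, FILE_ALIGNMENT)
gap = gap_space(virtual_size, raw_size)
room = expansion_space(virtual_size, raw_size, SECTION_ALIGNMENT)

print(f"raw size on disk: {raw_size:#x}")
print(f"gap space: {gap} bytes")
print(f"expansion space: {room} bytes")
```

Under these assumptions, bytes written into either region lie outside the data the program actually uses, which is consistent with the abstract's claim that availability and functionality are retained.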

CLC number: TP309.5

Cite this article: XIAO Mao, GUO Chun, SHEN Guowei, JIANG Chaohui. Adversarial Example Remaining Availability and Functionality[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(10): 2286-2297.

Figures/Tables: 14

References: 33

[1]	ROUSE M. Malware (malicious software)[EB/OL]. [2021-01-09]. https://searchsecurity.techtarget.com/definition/malware.
[2]	AV-TEST. Malware[EB/OL]. [2021-01-09]. https://www.av-test.org/en/statistics/malware/.
[3]	Symantec. ISTR[EB/OL]. [2021-01-09]. https://www.symantec.com/content/dam/symantec/docs/reports/istr-23-2018-en.pdf.
[4]	AV-TEST. Security report 2016-2017[EB/OL]. [2021-01-09]. https://www.av-test.org/fileadmin/pdf/security_report/AV-TEST_Security_Report_2016-2017.pdf.
[5]	MOURTAJI Y, BOUHORMA M, ALGHAZZAWI D M. Intelligent framework for malware detection with convolutional neural network[C]// Proceedings of the 2nd International Conference on Networking, Information Systems & Security, Rabat, Mar 27-29, 2019. New York: ACM, 2019: 1-6.
[6]	AGARAP A F. Towards building an intelligent anti-malware system: a deep learning approach using support vector machine (SVM) for malware classification[J]. arXiv:1801.00318, 2017.
[7]	KALASH M, ROCHAN M, MOHAMMED N, et al. Malware classification with deep convolutional neural networks[C]// Proceedings of the 9th IFIP International Conference on New Technologies, Mobility and Security, Paris, Feb 26-28, 2018. Piscataway: IEEE, 2018: 1-5.
[8]	DAI Y S, LI H, QIAN Y K, et al. A malware classification method based on memory dump grayscale image[J]. Digital Investigation, 2018, 27: 30-37.
[9]	KANCHERLA K, MUKKAMALA S. Image visualization based malware detection[C]// Proceedings of the 2013 IEEE Symposium on Computational Intelligence in Cyber Security, Singapore, Apr 16-19, 2013. Piscataway: IEEE, 2013: 40-44.
[10]	NAEEM H, GUO B, ULLAH F, et al. A cross-platform malware variant classification based on image representation[J]. KSII Transactions on Internet & Information Systems, 2019, 13(7): 3756-3777.
[11]	ZHOU X, PANG J, LIANG G. Image classification for malware detection using extremely randomized trees[C]// Proceedings of the 2017 11th IEEE International Conference on Anti-counterfeiting, Security, and Identification, Xiamen, Oct 27-29, 2017. Piscataway: IEEE, 2017: 54-59.
[12]	KREUK F, BARAK A, AVIV-REUVEN S, et al. Deceiving end-to-end deep learning malware detectors using adversarial examples[J]. arXiv:1802.04528, 2018.
[13]	KOLOSNJAJI B, DEMONTIS A, BIGGIO B, et al. Adversarial malware binaries: evading deep learning for malware detection in executables[C]// Proceedings of the 26th European Signal Processing Conference, Roma, Sep 3-7, 2018. Piscataway: IEEE, 2018: 533-537.
[14]	SUCIU O, COULL S E, JOHNS J. Exploring adversarial examples in malware detection[C]// Proceedings of the 2019 IEEE Security and Privacy Workshops, San Francisco, May 19-23, 2019. Piscataway: IEEE, 2019: 8-14.
[15]	ANDERSON H S, KHARKAR A, FILAR B, et al. Learning to evade static PE machine learning malware models via reinforcement learning[J]. arXiv:1801.08917, 2018.
[16]	LIU X B, ZHANG J L, LIN Y P, et al. ATMPA: attacking machine learning-based malware visualization detection methods via adversarial examples[C]// Proceedings of the 2019 International Symposium on Quality of Service, Phoenix, Jun 24-25, 2019. New York: ACM, 2019: 1-10.
[17]	VI B N, NGUYEN H N, NGUYEN N T, et al. Adversarial examples against image-based malware classification systems[C]// Proceedings of the 11th International Conference on Knowledge and Systems Engineering, Da Nang, Oct 24-26, 2019. Piscataway: IEEE, 2019: 1-5.
[18]	KHORMALI A, ABUSNAINA A, CHEN S, et al. COPYCAT: practical adversarial attacks on visualization-based malware detection[J]. arXiv:1909.09735, 2019.
[19]	RAFF E, BARKER J, SYLVESTER J, et al. Malware detection by eating a whole EXE[J]. arXiv:1710.09435, 2017.
[20]	GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and harnessing adversarial examples[J]. arXiv:1412.6572, 2014.
[21]	CARLINI N, WAGNER D A. Towards evaluating the robustness of neural networks[C]// Proceedings of the 2017 IEEE Symposium on Security and Privacy, San Jose, May 22-26, 2017. Washington: IEEE Computer Society, 2017: 39-57.
[22]	NARAYANAN B N, DJANEYE-BOUNDJOU O, KEBEDE T M. Performance analysis of machine learning and pattern recognition algorithms for malware classification[C]// Proceedings of the 2016 IEEE National Aerospace and Electronics Conference and Ohio Innovation Summit, Dayton, Jul 25-29, 2016. Piscataway: IEEE, 2016: 338-342.
[23]	DAVULURU V S P, NARAYANAN B N, BALSTER E J. Convolutional neural networks as classification tools and feature extractors for distinguishing malware programs[C]// Proceedings of the 2019 IEEE National Aerospace and Electronics Conference, Dayton, Jul 15-19, 2019. Piscataway: IEEE, 2019: 273-278.
[24]	GUO C, CHEN C Q, SHEN G W, et al. A ransomware classification method based on visualization[J]. Netinfo Security, 2020, 20(4): 31-39. (in Chinese)
[25]	PAPERNOT N, MCDANIEL P D, JHA S, et al. The limitations of deep learning in adversarial settings[C]// Proceedings of the 2016 IEEE European Symposium on Security and Privacy, Saarbrücken, Mar 21-24, 2016. Piscataway: IEEE, 2016: 372-387.
[26]	MOOSAVI-DEZFOOLI S M, FAWZI A, FROSSARD P. DeepFool: a simple and accurate method to fool deep neural networks[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 2574-2582.
[27]	GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[C]// Proceedings of the 28th Neural Information Processing Systems, Montreal, Dec 8-13, 2014: 2672-2680.
[28]	LI J J, WANG Q. Perceptually similar image classification adversarial example generation model[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(11): 1930-1942. (in Chinese)
[29]	LI P, ZHAO W T, LIU Q, et al. Security issues and their countermeasuring techniques of machine learning: a survey[J]. Journal of Frontiers of Computer Science and Technology, 2018, 12(2): 171-184. (in Chinese)
[30]	PAPERNOT N, GOODFELLOW I, SHEATSLEY R, et al. Cleverhans v1.0.0: an adversarial machine learning library[J]. arXiv:1610.00768, 2016.
[31]	Microsoft Corporation. Visual studio, Microsoft portable executable and common object file format specification[EB/OL]. [2020-05-10]. http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx.
[32]	Vxheaven. Vxheavens[EB/OL]. [2020-04-12]. https://archive.org/download/vxheavens-2010-05-18.
[33]	Virusshare. Virusshare[EB/OL]. [2021-01-09]. https://virusshare.com.

No.	Category	Family	Training samples	Test samples
1	Benign	Benign	412	176
2	Backdoor	PcClient	299	128
3	Backdoor	Poison	180	77
4	Backdoor	Shark	135	58
5	Backdoor	Small	260	112
6	Ransomware	Ceber	83	35
7	Ransomware	Gandcrab	111	48
8	Rootkit	Agent	99	42
9	Trojan-Banker	Banker	117	50
10	Trojan-Downloader	FraudLoad	190	81
11	Trojan-Downloader	Hmir	129	55
12	Trojan-Downloader	VB	183	78
13	Trojan-Downloader	Zlob	346	149
14	Trojan-GameThief	Magania	109	47
15	Trojan-GameThief	OnLineGames	149	64
16	Trojan-Spy	Pophot	198	85
17	Trojan-Spy	Zbot	282	121

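Attacked method	Attack method	BCT	Accuracy/%	MR/%	AAV/Byte
Gray+VGG16+SVM	—	—	90.33	—	—
Gray+VGG16+SVM	GBCA	0x00	90.33	0	2702
Gray+VGG16+SVM	GBCA	0x80	88.72	1.61	2702
Gray+VGG16+SVM	GBCA	0xFF	76.76	13.57	2702
Gray+VGG16+SVM	EBCA	0x00	72.91	17.42	7743
Gray+VGG16+SVM	EBCA	0x80	75.93	14.40	7743
Gray+VGG16+SVM	EBCA	0xFF	63.66	26.67	7743
Gray+VGG16+SVM	MBCA	0x00	72.91	17.42	8776
Gray+VGG16+SVM	MBCA	0x80	73.25	17.08	8776
Gray+VGG16+SVM	MBCA	0xFF	58.75	31.58	8776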
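
The MR values in this table are consistent with a simple difference: the baseline accuracy of 90.33% minus the accuracy under attack, in percentage points. The following short check recomputes the column on that assumption; the variable and function names are illustrative only, and the page itself does not state how MR is defined.

```python
BASELINE_ACCURACY = 90.33  # Gray+VGG16+SVM without attack, in percent

# Accuracy under attack, keyed by (attack method, BCT), copied from the table.
attacked_accuracy = {
    ("GBCA", "0x00"): 90.33, ("GBCA", "0x80"): 88.72, ("GBCA", "0xFF"): 76.76,
    ("EBCA", "0x00"): 72.91, ("EBCA", "0x80"): 75.93, ("EBCA", "0xFF"): 63.66,
    ("MBCA", "0x00"): 72.91, ("MBCA", "0x80"): 73.25, ("MBCA", "0xFF"): 58.75,
}


def accuracy_drop(baseline: float, attacked: float) -> float:
    """Drop in accuracy, in percentage points, rounded to two decimals."""
    return round(baseline - attacked, 2)


drops = {key: accuracy_drop(BASELINE_ACCURACY, acc)
         for key, acc in attacked_accuracy.items()}

# Configuration with the largest drop.
worst = max(drops, key=drops.get)

for (attack, bct), drop in drops.items():
    print(f"{attack} {bct}: {drop:.2f}")
print("largest drop:", worst, drops[worst])
```

The largest drop, for MBCA with 0xFF, is the 31.58 percentage points quoted in the abstract.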

Generation success rate	EBCA	GBCA	MBCA
SR/%	63.02	73.76	41.04

No.	Family	EBCA successful generations	GBCA successful generations	MBCA successful generations
1	Benign	131	145	104
2	B.PcClient	104	120	96
3	B.Poison	8	63	4
4	B.Shark	6	11	0
5	B.Small	0	112	0
6	RS.Ceber	35	35	35
7	RS.Gandcrab	48	47	47
8	R.Agent	16	38	13
9	TB.Banker	50	0	0
10	TD.FraudLoad	78	33	33
11	TD.Hmir	23	32	0
12	TD.VB	18	63	4
13	TD.Zlob	116	136	103
14	TG.Magania	47	18	18
15	TG.OnLineGames	0	64	0
16	TS.Pophot	85	85	85
17	TS.Zbot	121	35	35
Total	—	886	1037	577
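
The totals in this table reproduce the reported generation success rates if each total is divided by the size of the test set, that is, the sum of the test-sample column in the dataset table. The following sketch performs that check. It assumes SR is defined as successful generations over test samples, which the numbers support but the page does not state explicitly.

```python
# Test samples per family, copied from the dataset table.
test_samples = {
    "Benign": 176, "PcClient": 128, "Poison": 77, "Shark": 58, "Small": 112,
    "Ceber": 35, "Gandcrab": 48, "Agent": 42, "Banker": 50, "FraudLoad": 81,
    "Hmir": 55, "VB": 78, "Zlob": 149, "Magania": 47, "OnLineGames": 64,
    "Pophot": 85, "Zbot": 121,
}

# Total successful generations per attack method, from the "Total" row.
successes = {"EBCA": 886, "GBCA": 1037, "MBCA": 577}

total_tests = sum(test_samples.values())

# Success rate in percent, rounded to two decimals as in the SR table.
success_rate = {attack: round(100 * count / total_tests, 2)
                for attack, count in successes.items()}

print("test samples:", total_tests)
for attack, rate in success_rate.items():
    print(f"{attack}: {rate:.2f}%")
```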

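Evaluation criterion	ATMPA [17]	RSRC+FGSM [18]	COPYCAT [19]	BARAF
Retains availability	No	Yes	Yes	Yes
Retains functionality	No	No	Yes	Yes
Detection accuracy	Relatively low	Relatively low	Relatively high	Relatively low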

Metric	GBCA	EBCA
TPR/%	100.00	100.00
FPR/%	17.07	59.81
