Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (10): 2286-2297. DOI: 10.3778/j.issn.1673-9418.2103057
XIAO Mao, GUO Chun (corresponding author), SHEN Guowei, JIANG Chaohui
Received: 2021-03-17
Revised: 2021-05-14
Published: 2022-10-01
Online: 2021-05-18
Corresponding author: E-mail: gc_gzedu@163.com
About author: XIAO Mao, born in 1996, M.S. candidate, member of CCF. His research interests include network security and malware detection.
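Abstract:
Malware detection methods based on grayscale images have attracted considerable attention because they require no disassembly and achieve high detection accuracy. Several adversarial attacks against this class of detectors already exist. However, most of them either cannot guarantee that the generated adversarial examples preserve the availability or functionality of the original PE file, or they append bytes at the end of the PE file, where the change can be detected accurately from file header information alone. By analyzing the section alignment and file alignment mechanisms of PE files, this paper proposes a bytecode attack that preserves the availability and functionality of PE files (BARAF). The method modifies or adds bytes in bulk within the gap space produced by the file alignment mechanism and the extension space made available by the section alignment mechanism, generating adversarial examples that remain available and functional and that deceive grayscale-image-based malware detection methods. Experimental results show that adversarial examples generated by BARAF reduce the accuracy of a grayscale-image-based malware detection method by up to 31.58 percentage points, and that they are difficult to detect accurately from file header information.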
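The two kinds of space named in the abstract can be illustrated with plain alignment arithmetic. The sketch below is an interpretation, not the authors' implementation: it assumes that gap space is the padding between a section's `VirtualSize` and its file-aligned `SizeOfRawData`, and that extension space is the room remaining before the section's in-memory footprint, rounded up to `SectionAlignment`, is used up. The function names and the example sizes are hypothetical.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return (value + alignment - 1) // alignment * alignment


def gap_space(virtual_size: int, raw_size: int) -> int:
    """Padding bytes at the end of a section's raw data (assumed definition).

    File alignment pads the raw data up to a multiple of FileAlignment,
    so bytes beyond VirtualSize are not used by the loaded section.
    """
    return max(raw_size - virtual_size, 0)


def extension_space(virtual_size: int, raw_size: int, section_alignment: int) -> int:
    """Bytes by which raw data could grow inside the section's memory slot
    (assumed definition).

    In memory the section occupies VirtualSize rounded up to SectionAlignment,
    so the next section's virtual address is not disturbed below that bound.
    """
    memory_slot = align_up(max(virtual_size, 1), section_alignment)
    return max(memory_slot - raw_size, 0)


# Hypothetical section: 0x1234 bytes of content, common alignment values.
FILE_ALIGNMENT = 0x200
SECTION_ALIGNMENT = 0x1000
virtual_size = 0x1234
raw_size = align_up(virtual_size, FILE_ALIGNMENT)  # 0x1400

gap = gap_space(virtual_size, raw_size)
extension = extension_space(virtual_size, raw_size, SECTION_ALIGNMENT)
print(f"raw size: {raw_size:#x}, gap: {gap} bytes, extension: {extension} bytes")
```

A real tool would read these fields from the section table of an actual PE file and would also have to update the headers of any section it enlarges; that bookkeeping is omitted here.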
Cite this article:
XIAO Mao, GUO Chun, SHEN Guowei, JIANG Chaohui. Adversarial Example Remaining Availability and Functionality[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(10): 2286-2297.
No. | Category | Family | Training samples | Test samples |
---|---|---|---|---|
1 | Benign | Benign | 412 | 176 |
2 | Backdoor | PcClient | 299 | 128 |
3 | Backdoor | Poison | 180 | 77 |
4 | Backdoor | Shark | 135 | 58 |
5 | Backdoor | Small | 260 | 112 |
6 | Ransomware | Ceber | 83 | 35 |
7 | Ransomware | Gandcrab | 111 | 48 |
8 | Rootkit | Agent | 99 | 42 |
9 | Trojan-Banker | Banker | 117 | 50 |
10 | Trojan-Downloader | FraudLoad | 190 | 81 |
11 | Trojan-Downloader | Hmir | 129 | 55 |
12 | Trojan-Downloader | VB | 183 | 78 |
13 | Trojan-Downloader | Zlob | 346 | 149 |
14 | Trojan-GameThief | Magania | 109 | 47 |
15 | Trojan-GameThief | OnLineGames | 149 | 64 |
16 | Trojan-Spy | Pophot | 198 | 85 |
17 | Trojan-Spy | Zbot | 282 | 121 |
Table 1 Details of the experimental dataset
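Attacked method | Attack method | BCT | Accuracy/% | MR/% | AAV/Byte |
---|---|---|---|---|---|
Gray+VGG16+SVM | — | — | 90.33 | — | — |
Gray+VGG16+SVM | GBCA | 0x00 | 90.33 | 0 | 2 702 |
Gray+VGG16+SVM | GBCA | 0x80 | 88.72 | 1.61 | 2 702 |
Gray+VGG16+SVM | GBCA | 0xFF | 76.76 | 13.57 | 2 702 |
Gray+VGG16+SVM | EBCA | 0x00 | 72.91 | 17.42 | 7 743 |
Gray+VGG16+SVM | EBCA | 0x80 | 75.93 | 14.40 | 7 743 |
Gray+VGG16+SVM | EBCA | 0xFF | 63.66 | 26.67 | 7 743 |
Gray+VGG16+SVM | MBCA | 0x00 | 72.91 | 17.42 | 8 776 |
Gray+VGG16+SVM | MBCA | 0x80 | 73.25 | 17.08 | 8 776 |
Gray+VGG16+SVM | MBCA | 0xFF | 58.75 | 31.58 | 8 776 |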
Table 2 Impact of BARAF on the performance of Gray+VGG16+SVM
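The MR column in Table 2 appears to be the drop in accuracy relative to the unattacked baseline of 90.33%. This excerpt does not define MR, so the relation is inferred from the figures only. The following sketch recomputes the drop for every row and reports the strongest setting; the variable and function names are illustrative.

```python
BASELINE_ACCURACY = 90.33  # Gray+VGG16+SVM without attack (Table 2)

# (attack method, BCT) -> (accuracy after attack in %, MR in % as reported)
TABLE_2 = {
    ("GBCA", "0x00"): (90.33, 0.00),
    ("GBCA", "0x80"): (88.72, 1.61),
    ("GBCA", "0xFF"): (76.76, 13.57),
    ("EBCA", "0x00"): (72.91, 17.42),
    ("EBCA", "0x80"): (75.93, 14.40),
    ("EBCA", "0xFF"): (63.66, 26.67),
    ("MBCA", "0x00"): (72.91, 17.42),
    ("MBCA", "0x80"): (73.25, 17.08),
    ("MBCA", "0xFF"): (58.75, 31.58),
}


def accuracy_drop(baseline: float, attacked: float) -> float:
    """Accuracy lost under attack, in percentage points."""
    return baseline - attacked


# Rows where the recomputed drop disagrees with the reported MR value.
mismatches = [
    key
    for key, (accuracy, reported_mr) in TABLE_2.items()
    if abs(accuracy_drop(BASELINE_ACCURACY, accuracy) - reported_mr) > 1e-9
]

# The setting with the largest reported drop.
strongest = max(TABLE_2, key=lambda key: TABLE_2[key][1])

print("rows that disagree with the reported MR:", mismatches)
print("strongest setting:", strongest, "with MR", TABLE_2[strongest][1])
```

Under this reading, the 31.58 percentage point figure quoted in the abstract corresponds to MBCA with fill byte 0xFF.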
Creation success rate | EBCA | GBCA | MBCA |
---|---|---|---|
SR/% | 63.02 | 73.76 | 41.04 |
Table 3 Creation success rate of adversarial examples (unit: %)
No. | Family | EBCA successes | GBCA successes | MBCA successes |
---|---|---|---|---|
1 | Benign | 131 | 145 | 104 |
2 | B.PcClient | 104 | 120 | 96 |
3 | B.Poison | 8 | 63 | 4 |
4 | B.Shark | 6 | 11 | 0 |
5 | B.Small | 0 | 112 | 0 |
6 | RS.Ceber | 35 | 35 | 35 |
7 | RS.Gandcrab | 48 | 47 | 47 |
8 | R.Agent | 16 | 38 | 13 |
9 | TB.Banker | 50 | 0 | 0 |
10 | TD.FraudLoad | 78 | 33 | 33 |
11 | TD.Hmir | 23 | 32 | 0 |
12 | TD.VB | 18 | 63 | 4 |
13 | TD.Zlob | 116 | 136 | 103 |
14 | TG.Magania | 47 | 18 | 18 |
15 | TG.OnLineGames | 0 | 64 | 0 |
16 | TS.Pophot | 85 | 85 | 85 |
17 | TS.Zbot | 121 | 35 | 35 |
Total | — | 886 | 1 037 | 577 |
Table 4 Generation results of adversarial examples for each family
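The success rates in Table 3 can be cross-checked against the totals in Table 4. The sketch below assumes the denominator is the total number of test samples listed in Table 1; the excerpt does not state this explicitly, but the figures agree under that assumption. Variable names are illustrative.

```python
# Test samples per family, in the row order of Table 1.
TEST_SAMPLES = [176, 128, 77, 58, 112, 35, 48, 42, 50,
                81, 55, 78, 149, 47, 64, 85, 121]

# Total successfully generated adversarial examples per attack (Table 4).
SUCCESSES = {"EBCA": 886, "GBCA": 1037, "MBCA": 577}

total_test_samples = sum(TEST_SAMPLES)

# Success rate in percent, rounded to two decimals as in Table 3.
success_rates = {
    method: round(100 * count / total_test_samples, 2)
    for method, count in SUCCESSES.items()
}

print("test samples:", total_test_samples)
for method, rate in success_rates.items():
    print(f"{method}: {rate:.2f}%")
```

The recomputed values match the SR row of Table 3, which suggests the per-family counts and the reported rates are consistent with each other.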
Metric | GBCA | EBCA |
---|---|---|
TPR/% | 100.00 | 100.00 |
FPR/% | 17.07 | 59.81 |
Table 5 Detection results obtained from file header information on test sets attacked by different attacks (unit: %)
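Criterion | ATMPA | RSRC+FGSM | COPYCAT | BARAF |
---|---|---|---|---|
Preserves availability | No | Yes | Yes | Yes |
Preserves functionality | No | No | Yes | Yes |
Detection accuracy | Relatively low | Relatively low | Relatively high | Relatively low |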
Table 6 Comparison of adversarial examples generated by different attack methods
[1] | ROUSE M. Malware (malicious software)[EB/OL]. [2021-01-09]. https://searchsecurity.techtarget.com/definition/malware. |
[2] | AV-TEST. Malware[EB/OL]. [2021-01-09]. https://www.av-test.org/en/statistics/malware/. |
[3] | Symantec. ISTR[EB/OL]. [2021-01-09]. https://www.symantec.com/content/dam/symantec/docs/reports/istr-23-2018-en.pdf. |
[4] | AV-TEST. Security report 2016-2017[EB/OL]. [2021-01-09]. https://www.av-test.org/fileadmin/pdf/security_report/AV-TEST_Security_Report_2016-2017.pdf. |
[5] | MOURTAJI Y, BOUHORMA M, ALGHAZZAWI D M. Intelligent framework for malware detection with convolutional neural network[C]// Proceedings of the 2nd International Conference on Networking, Information Systems & Security, Rabat, Mar 27-29, 2019. New York: ACM, 2019: 1-6. |
[6] | AGARAP A F. Towards building an intelligent anti-malware system: a deep learning approach using support vector machine (SVM) for malware classification[J]. arXiv:1801.00318, 2017. |
[7] | KALASH M, ROCHAN M, MOHAMMED N, et al. Malware classification with deep convolutional neural networks[C]// Proceedings of the 9th IFIP International Conference on New Technologies, Mobility and Security, Paris, Feb 26-28, 2018. Piscataway: IEEE, 2018: 1-5. |
[8] | DAI Y S, LI H, QIAN Y K, et al. A malware classification method based on memory dump grayscale image[J]. Digital Investigation, 2018, 27: 30-37. |
[9] | KANCHERLA K, MUKKAMALA S. Image visualization based malware detection[C]// Proceedings of the 2013 IEEE Symposium on Computational Intelligence in Cyber Security, Singapore, Apr 16-19, 2013. Piscataway: IEEE, 2013: 40-44. |
[10] | NAEEM H, GUO B, ULLAH F, et al. A cross-platform malware variant classification based on image representation[J]. KSII Transactions on Internet & Information Systems, 2019, 13(7): 3756-3777. |
[11] | ZHOU X, PANG J, LIANG G. Image classification for malware detection using extremely randomized trees[C]// Proceedings of the 2017 11th IEEE International Conference on Anti-counterfeiting, Security, and Identification, Xiamen, Oct 27-29, 2017. Piscataway: IEEE, 2017: 54-59. |
[12] | KREUK F, BARAK A, AVIV-REUVEN S, et al. Deceiving end-to-end deep learning malware detectors using adversarial examples[J]. arXiv:1802.04528, 2018. |
[13] | KOLOSNJAJI B, DEMONTIS A, BIGGIO B, et al. Adversarial malware binaries: evading deep learning for malware detection in executables[C]// Proceedings of the 26th European Signal Processing Conference, Roma, Sep 3-7, 2018. Piscataway: IEEE, 2018: 533-537. |
[14] | SUCIU O, COULL S E, JOHNS J. Exploring adversarial examples in malware detection[C]// Proceedings of the 2019 IEEE Security and Privacy Workshops, San Francisco, May 19-23, 2019. Piscataway: IEEE, 2019: 8-14. |
[15] | ANDERSON H S, KHARKAR A, FILAR B, et al. Learning to evade static PE machine learning malware models via reinforcement learning[J]. arXiv:1801.08917, 2018. |
[16] | LIU X B, ZHANG J L, LIN Y P, et al. ATMPA: attacking machine learning-based malware visualization detection methods via adversarial examples[C]// Proceedings of the 2019 International Symposium on Quality of Service, Phoenix, Jun 24-25, 2019. New York: ACM, 2019: 1-10. |
[17] | VI B N, NGUYEN H N, NGUYEN N T, et al. Adversarial examples against image-based malware classification systems[C]// Proceedings of the 11th International Conference on Knowledge and Systems Engineering, Da Nang, Oct 24-26, 2019. Piscataway: IEEE, 2019: 1-5. |
[18] | KHORMALI A, ABUSNAINA A, CHEN S, et al. COPYCAT: practical adversarial attacks on visualization-based malware detection[J]. arXiv:1909.09735, 2019. |
[19] | RAFF E, BARKER J, SYLVESTER J, et al. Malware detection by eating a whole EXE[J]. arXiv:1710.09435, 2017. |
[20] | GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and harnessing adversarial examples[J]. arXiv:1412.6572, 2014. |
[21] | CARLINI N, WAGNER D A. Towards evaluating the robustness of neural networks[C]// Proceedings of the 2017 IEEE Symposium on Security and Privacy, San Jose, May 22-26, 2017. Washington: IEEE Computer Society, 2017: 39-57. |
[22] | NARAYANAN B N, DJANEYE-BOUNDJOU O, KEBEDE T M. Performance analysis of machine learning and pattern recognition algorithms for malware classification[C]// Proceedings of the 2016 IEEE National Aerospace and Electronics Conference and Ohio Innovation Summit, Dayton, Jul 25-29, 2016. Piscataway: IEEE, 2016: 338-342. |
[23] | DAVULURU V S P, NARAYANAN B N, BALSTER E J. Convolutional neural networks as classification tools and feature extractors for distinguishing malware programs[C]// Proceedings of the 2019 IEEE National Aerospace and Electronics Conference, Dayton, Jul 15-19, 2019. Piscataway: IEEE, 2019: 273-278. |
[24] | GUO C, CHEN C Q, SHEN G W, et al. A ransomware classification method based on visualization[J]. Netinfo Security, 2020, 20(4): 31-39. |
[25] | PAPERNOT N, MCDANIEL P D, JHA S, et al. The limitations of deep learning in adversarial settings[C]// Proceedings of the 2016 IEEE European Symposium on Security and Privacy, Saarbrücken, Mar 21-24, 2016. Piscataway: IEEE, 2016: 372-387. |
[26] | MOOSAVI-DEZFOOLI S M, FAWZI A, FROSSARD P. DeepFool: a simple and accurate method to fool deep neural networks[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 2574-2582. |
[27] | GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[C]// Proceedings of the 28th Neural Information Processing Systems, Montreal, Dec 8-13, 2014: 2672-2680. |
[28] | LI J J, WANG Q. Perceptually similar image classification adversarial example generation model[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(11): 1930-1942. |
[29] | LI P, ZHAO W T, LIU Q, et al. Security issues and their countermeasuring techniques of machine learning: a survey[J]. Journal of Frontiers of Computer Science and Technology, 2018, 12(2): 171-184. |
[30] | PAPERNOT N, GOODFELLOW I, SHEATSLEY R, et al. Cleverhans v1.0.0: an adversarial machine learning library[J]. arXiv:1610.00768, 2016. |
[31] | Microsoft Corporation. Visual studio, Microsoft portable executable and common object file format specification[EB/OL]. [2020-05-10]. http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx. |
[32] | Vxheaven. Vxheavens[EB/OL]. [2020-04-12]. https://archive.org/download/vxheavens-2010-05-18. |
[33] | Virusshare. Virusshare[EB/OL]. [2021-01-09]. https://virusshare.com. |