Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2021, Vol. 15, Issue (12): 2315-2326. DOI: 10.3778/j.issn.1673-9418.2007087

• Network and Information Security •

Research of Remote Access Trojan Early Detection Method Using Sequence Analysis (利用序列分析的远控木马早期检测方法研究)

WANG Chen (王晨), GUO Chun (郭春), SHEN Guowei (申国伟), CUI Yunhe (崔允贺)

  1. School of Computer Science and Technology, Guizhou University, Guiyang 550025, China
  2. State Key Laboratory of Public Big Data, Guiyang 550025, China
  • Online: 2021-12-01 Published: 2021-12-09

Abstract:

A remote access Trojan (RAT) is a type of malware whose main purpose is to steal confidential information, and it seriously threatens the security of cyberspace. Most current network-based RAT detection methods require largely complete data flows, so their detection is delayed to some extent. Based on an analysis of the sequence characteristics of RAT traffic in the initial stage after the communication session is established, this paper proposes an early RAT detection method using sequence analysis. The method takes the first TCP flow in the interaction between the RAT's controlled end and control end as its object of analysis. It focuses on the first packet in that flow that is sent from the internal host to the external network with a transport-layer payload larger than α bytes (called the information return packet; 上线包 in the Chinese abstract), together with the several packets that follow it. From these packets it extracts three-dimensional features, comprising the payload size sequence, the number of bytes transferred, and the time intervals, and uses a machine learning algorithm to build an efficient early detection model. Experimental results show that the method can detect RATs quickly: it identifies RAT traffic with high accuracy from only a small number of packets in the early stage after the RAT session is established.

Key words: remote access Trojan (RAT), sequence analysis, early detection, network communication behavior
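
The feature extraction step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Packet`, `find_information_return_packet`, and `extract_early_features`, the default values `alpha=50` and `n_following=4`, and the choice to encode packet direction as the sign of the payload size are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    """One packet of the first TCP flow (hypothetical representation)."""
    timestamp: float   # capture time in seconds
    outbound: bool     # True if sent from the internal host to the external network
    payload_len: int   # transport-layer payload size in bytes


def find_information_return_packet(flow: List[Packet], alpha: int) -> Optional[int]:
    """Index of the first outbound packet whose payload exceeds alpha bytes."""
    for i, pkt in enumerate(flow):
        if pkt.outbound and pkt.payload_len > alpha:
            return i
    return None


def extract_early_features(flow: List[Packet], alpha: int = 50,
                           n_following: int = 4) -> Optional[List[float]]:
    """Build a feature vector from the information return packet and the
    n_following packets after it. Returns None if the flow has no such
    packet or is still too short to fill the window."""
    start = find_information_return_packet(flow, alpha)
    if start is None:
        return None
    window = flow[start:start + 1 + n_following]
    if len(window) < 1 + n_following:
        return None
    # Feature 1: payload size sequence (assumption: inbound sizes are negated).
    sizes = [p.payload_len if p.outbound else -p.payload_len for p in window]
    # Feature 2: number of payload bytes transferred within the window.
    total_bytes = sum(p.payload_len for p in window)
    # Feature 3: time intervals between consecutive packets in the window.
    intervals = [b.timestamp - a.timestamp for a, b in zip(window, window[1:])]
    return sizes + [total_bytes] + intervals


# Made-up example flow: three handshake packets without payload, then data.
example_flow = [
    Packet(0.0, True, 0),
    Packet(0.5, False, 0),
    Packet(1.0, True, 0),
    Packet(2.0, True, 120),   # first outbound packet with payload > alpha
    Packet(2.5, False, 20),
    Packet(4.0, True, 300),
]
print(extract_early_features(example_flow, alpha=50, n_following=2))
# prints [120, -20, 300, 440, 0.5, 1.5]
```

A vector of this shape could then be passed to any standard classifier. Because the window is fixed at a few packets, a decision is possible as soon as those packets have been observed, without waiting for the flow to finish.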