Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (3): 805-817. DOI: 10.3778/j.issn.1673-9418.2304045

• Network·Security •

融合SENet和Transformer的应用层协议识别方法

陈乾,洪征,司健鹏   

  1. 中国人民解放军陆军工程大学 指挥控制工程学院,南京 210000
  • 出版日期:2024-03-01 发布日期:2024-03-01

Application Layer Protocol Recognition Incorporating SENet and Transformer

CHEN Qian, HONG Zheng, SI Jianpeng   

  1. Command and Control Engineering College, Army Engineering University of PLA, Nanjing 210000, China
  • Online: 2024-03-01 Published: 2024-03-01

摘要: 协议识别技术在网络通信和信息安全领域具有至关重要的地位和作用。针对现有基于时空特征的协议识别方法提取协议特征不充分、不全面的问题,提出了一种基于SENet和Transformer的应用层协议识别方法。该方法关注协议数据的时空特征,由加入SENet注意力的残差网络构成的空间特征提取模块和Transformer网络编码器构成的时间提取模块组成。空间特征提取阶段,在残差网络结构中加入SE块获取多个卷积通道间的联系,自适应地为通道分配权重,提取不同通道中更加活跃的协议空间特征;时间特征提取阶段,由基于多头注意力机制的Transformer编码器通过堆叠的方式构建时间特征提取模块,利用输入数据的位置信息全面地获取协议数据的时间特征。通过对更加充足的空间特征和更加全面的时间特征的提取和学习,可以获得更有效的协议识别信息,进而提高协议识别性能。在ISCX2012和CSE_CIC_IDS2018混合数据集上的实验结果表明,所提模型的总体识别准确率达到99.20%,F1值达到98.99%,高于对比模型。

关键词: SENet, 残差网络, 自注意力, Transformer, 协议识别, 网络安全

Abstract: Protocol recognition plays a crucial role in network communication and information security. Existing protocol recognition methods based on spatio-temporal features do not extract protocol features adequately or comprehensively. To address this problem, an application layer protocol recognition method based on SENet and Transformer is proposed. The method focuses on the spatio-temporal features of protocol data and consists of a spatial feature extraction module, built from a residual network with SENet channel attention, and a temporal feature extraction module, built from Transformer encoders. In the spatial feature extraction stage, SE blocks are added to the residual network to capture the relationships among multiple convolutional channels and adaptively assign a weight to each channel, so that the more active spatial features in different channels are extracted. In the temporal feature extraction stage, the module is constructed by stacking Transformer encoders based on the multi-head attention mechanism, and it uses the positional information of the input data to comprehensively capture the temporal features of the protocol data. By extracting and learning richer spatial features and more comprehensive temporal features, the model obtains more effective information for protocol recognition, which improves recognition performance. Experiments on the hybrid ISCX2012 and CSE_CIC_IDS2018 dataset show that the overall recognition accuracy of the proposed model reaches 99.20% and the F1 score reaches 98.99%, both higher than those of the comparison models.

Key words: SENet, residual network, self-attention, Transformer, protocol recognition, network security
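
The two mechanisms named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `se_reweight` and `self_attention`, the weight matrices, and the toy inputs are all hypothetical, and the attention step is simplified to a single head with identity projections instead of the stacked multi-head Transformer encoders described above. How the paper connects the two modules is not stated in the abstract and is not modeled here.

```python
import math


def se_reweight(feature_maps, w1, w2):
    """Squeeze-and-excitation over a list of 1-D channels.

    feature_maps: list of C channels, each a list of floats.
    w1: bottleneck weights, shape (C/r) x C   (hypothetical values).
    w2: expansion weights,  shape C x (C/r)   (hypothetical values).
    Returns (gates, scaled_feature_maps).
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]
    # Excitation, step 2: expansion layer with a sigmoid gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: every element of a channel is multiplied by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, feature_maps)]
    return gates, scaled


def self_attention(seq):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the inputs themselves
    (identity projections); a real encoder learns separate projections
    and uses several heads.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        # Numerically stable softmax over the scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the attention-weighted average of all positions.
        out.append([sum(w * v[j] for w, v in zip(weights, seq))
                    for j in range(d)])
    return out


# Toy input: 4 channels, reduction ratio r = 2 (all values are made up).
channels = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0], [4.0, 0.0]]
w1 = [[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, 0.1]]
w2 = [[1.0, -1.0], [0.5, 0.5], [-0.3, 0.8], [0.2, 0.2]]
gates, scaled = se_reweight(channels, w1, w2)
print("channel gates:", gates)
print("attended sequence:", self_attention(scaled))
```

In this sketch each gate lies strictly between 0 and 1, so a channel is attenuated by a factor that depends on the global statistics of all channels, which is the sense in which SE blocks "adaptively assign weights". The attention function lets every position attend to every other position in one step; positional information would have to be added to the inputs separately.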