Wireless Network Intrusion Detection Algorithm Based on Multiple Perspectives Hierarchical Clustering

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (12): 2752-2764. DOI: 10.3778/j.issn.1673-9418.2104115

• Network and Information Security •

DONG Xinyu¹,², XIE Bin¹,²,³,⁺, ZHAO Xusheng¹, GAO Xinbao¹

1. College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China
2. Hebei Provincial Key Laboratory of Network & Information Security, Hebei Normal University, Shijiazhuang 050024, China
3. Hebei Provincial Engineering Research Center for Supply Chain Big Data Analytics & Data Security, Hebei Normal University, Shijiazhuang 050024, China

Received: 2021-05-08  Revised: 2021-06-25  Online: 2022-12-01  Published: 2021-06-16
About the authors:
DONG Xinyu, born in 1995, M.S. Her research interests include machine learning and cyber security.
XIE Bin, born in 1976, Ph.D., professor, M.S. supervisor. His research interests include granular computing, machine learning and approximate reasoning.
ZHAO Xusheng, born in 2000. His research interests include machine learning and cyber security.
GAO Xinbao, born in 1999. His research interests include machine learning and cyber security.
Supported by:
National Natural Science Foundation of China (62076088); Natural Science Foundation of Hebei Provincial Education Department (QN2021083); Technological Innovation Foundation of Hebei Normal University (L2020K09)

Chinese title: 多视角层次聚类下的无线网络入侵检测算法

Corresponding author: ⁺ E-mail: xiebin_hebtu@126.com

Abstract:

Existing wireless network intrusion detection algorithms based on supervised learning suffer from a high false detection rate, difficulty in discovering unknown attack behavior, and the high cost of obtaining labeled data. To address these problems, this paper proposes an unsupervised wireless network intrusion detection algorithm based on multiple perspectives hierarchical clustering. Because the algorithm is unsupervised, it does not require manual labeling of the large volume of wireless network data used for classifier learning; training datasets are therefore easy to obtain, and unknown types of attack behavior can be detected. The algorithm also introduces the multiple perspectives cosine distance as the similarity measure between wireless network data objects in hierarchical clustering, which makes the clustering results more reasonable and the judgment of network data behavior more accurate, and reduces the false detection rate of intrusion detection to a certain extent. The Aegean WiFi intrusion dataset (AWID) is used as the experimental dataset, and principal component analysis is used to reduce its dimensionality, which greatly reduces the time complexity of the intrusion detection algorithm. Experimental results show that, compared with traditional wireless network intrusion detection algorithms, the proposed algorithm achieves significant improvements in detection rate, false detection rate, and detection of unknown attack types.

Key words: multiple perspectives, hierarchical clustering, wireless network, intrusion detection, principal component analysis (PCA)

CLC Number: TP393

DONG Xinyu, XIE Bin, ZHAO Xusheng, GAO Xinbao. Wireless Network Intrusion Detection Algorithm Based on Multiple Perspectives Hierarchical Clustering[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(12): 2752-2764.

Figures/Tables 26

Fig.1 Schematic diagram of wireless network intrusion detection process

Fig.2 Hierarchical clustering diagram

Fig.3 Hierarchical clustering process diagram

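Table 1 Symbols used in hierarchical clustering

Symbol	Description
$n$	Number of objects
$m$	Number of attributes
$k$	Number of clusters
$x$	Object vector
$S = \{x_1, x_2, \cdots, x_n\}$	Set of objects
$S_h = \{d_1, d_2, \cdots, d_l\}$	Set of datum points
$C = \{C_1, C_2, \cdots, C_k\}$	Set of clusters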

Fig.4 Measuring distance between data objects from origin view

Fig.5 Schematic diagram of arbitrary datum point in three-dimensional space

Table 2 Coordinates of 6 datum points

$X$ coordinate	$Y$ coordinate	$Z$ coordinate
0.433 012 702	0.750	0
0.433 012 702	-0.375	0.649 519 053
0.433 012 702	-0.375	-0.649 519 053
-0.433 012 702	0.750	0
-0.433 012 702	-0.375	0.649 519 053
-0.433 012 702	-0.375	-0.649 519 053
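
As an illustration of how datum points such as those in Table 2 could enter a similarity measure, the following Python sketch averages, over all datum points, the cosine of the angle between two objects as seen from each datum point. This averaging rule and the names `DATUM_POINTS`, `cosine` and `multi_view_cosine` are assumptions made for illustration; the exact multiple perspectives cosine distance used in the paper is not given in this summary and may differ.

```python
import math

# The six three-dimensional datum points listed in Table 2.
A = 0.433012702
B = 0.649519053
DATUM_POINTS = [
    (A, 0.750, 0.0),
    (A, -0.375, B),
    (A, -0.375, -B),
    (-A, 0.750, 0.0),
    (-A, -0.375, B),
    (-A, -0.375, -B),
]


def cosine(u, v):
    """Cosine of the angle between vectors u and v (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def multi_view_cosine(x, y, datum_points=DATUM_POINTS):
    """ASSUMED definition: mean cosine similarity of x and y as viewed
    from each datum point d, i.e. the mean of cos(x - d, y - d)."""
    total = 0.0
    for d in datum_points:
        x_rel = [xi - di for xi, di in zip(x, d)]
        y_rel = [yi - di for yi, di in zip(y, d)]
        total += cosine(x_rel, y_rel)
    return total / len(datum_points)
```

Under this assumed definition the measure is symmetric, lies in [-1, 1], and equals 1 for two identical objects that do not coincide with a datum point. The tabulated datum points all lie at the same distance from the origin, about 0.866.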

Fig.6 Schematic diagram of three-dimensional space datum point set

Table 3 Comparison of datum set size between full granularity and multi-perspective methods

Dimension $n$	Number of full-granularity datum points	Number of multi-perspective datum points
3	8	6
4	16	12
5	32	24
6	64	48
7	128	96
8	256	192
9	512	384
10	1 024	768
11	2 048	1 536
12	4 096	3 072
13	8 192	6 144
14	16 384	12 288
15	32 768	24 576
16	65 536	49 152
17	131 072	98 304
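
The counts in Table 3 are consistent with two simple closed forms: the full-granularity column matches $2^n$ and the multi-perspective column matches $3 \cdot 2^{n-2}$, which is three quarters of the former. These formulas are inferred from the tabulated values only and are not quoted from the paper; the function names below are hypothetical. A short Python check:

```python
# Values copied from Table 3: dimension -> (full-granularity, multi-perspective).
TABLE_3 = {
    3: (8, 6), 4: (16, 12), 5: (32, 24), 6: (64, 48), 7: (128, 96),
    8: (256, 192), 9: (512, 384), 10: (1024, 768), 11: (2048, 1536),
    12: (4096, 3072), 13: (8192, 6144), 14: (16384, 12288),
    15: (32768, 24576), 16: (65536, 49152), 17: (131072, 98304),
}


def full_granularity_count(n):
    """Inferred pattern: 2^n datum points in dimension n."""
    return 2 ** n


def multi_perspective_count(n):
    """Inferred pattern: 3 * 2^(n-2) datum points in dimension n (n >= 2)."""
    return 3 * 2 ** (n - 2)


def matches_table(table=TABLE_3):
    """True if both inferred formulas reproduce every row of the table."""
    return all(
        (full_granularity_count(n), multi_perspective_count(n)) == row
        for n, row in table.items()
    )
```

If the pattern holds beyond the tabulated range, the multi-perspective datum set is always 25% smaller than the full-granularity one, but both still grow exponentially with the dimension.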

Table 4

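Component	Initial eigenvalues			Extraction sums of squared loadings			Rotation sums of squared loadings
Component	Total	% of variance	Cumulative %	Total	% of variance	Cumulative %	Total	% of variance	Cumulative %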
1	11.144	14.472	14.472	11.144	14.472	14.472	11.063	14.367	14.367
2	9.271	12.040	26.513	9.271	12.040	26.513	6.425	8.345	22.712
3	7.302	9.483	35.996	7.302	9.483	35.996	6.186	8.033	30.745
4	6.640	8.624	44.620	6.640	8.624	44.620	5.937	7.710	38.455
5	5.745	7.461	52.081	5.745	7.461	52.081	5.741	7.456	45.910
6	3.703	4.809	56.890	3.703	4.809	56.890	4.858	6.309	52.219
7	2.594	3.368	60.259	2.594	3.368	60.259	3.768	4.893	57.113
8	2.468	3.206	63.464	2.468	3.206	63.464	2.532	3.288	60.401
9	2.147	2.788	66.252	2.147	2.788	66.252	2.440	3.169	63.570
10	2.001	2.599	68.851	2.001	2.599	68.851	2.205	2.863	66.433
11	1.623	2.108	70.959	1.623	2.108	70.959	2.105	2.734	69.167
12	1.408	1.829	72.788	1.408	1.829	72.788	2.003	2.601	71.769
13	1.219	1.583	74.370	1.219	1.583	74.370	1.750	2.272	74.041
14	1.199	1.557	75.927	1.199	1.557	75.927	1.442	1.873	75.914
15	1.024	1.330	77.258	1.024	1.330	77.258	1.031	1.339	77.253
16	1.000	1.299	78.556	1.000	1.299	78.556	1.003	1.303	78.556
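
The percentage columns of Table 4 can be reproduced from the eigenvalues. The Python sketch below assumes the total variance is 77, that is, 77 standardized input features; this figure is not stated in the summary and is inferred from the ratio of eigenvalue to variance percentage in the first row (11.144 / 0.14472 ≈ 77). The function names are hypothetical.

```python
# Initial eigenvalues of the 16 listed components, copied from Table 4.
EIGENVALUES = [
    11.144, 9.271, 7.302, 6.640, 5.745, 3.703, 2.594, 2.468,
    2.147, 2.001, 1.623, 1.408, 1.219, 1.199, 1.024, 1.000,
]

# ASSUMPTION: total variance of 77 (77 standardized features), inferred
# from the table itself rather than stated in the source.
N_FEATURES = 77


def variance_percentages(eigenvalues, n_features=N_FEATURES):
    """Share of total variance explained by each component, in percent."""
    return [100.0 * ev / n_features for ev in eigenvalues]


def cumulative(values):
    """Running totals of a sequence of numbers."""
    total = 0.0
    running = []
    for v in values:
        total += v
        running.append(total)
    return running


def components_needed(eigenvalues, target_percent, n_features=N_FEATURES):
    """Smallest number of leading components whose cumulative share of
    variance reaches target_percent, or None if it is never reached."""
    shares = cumulative(variance_percentages(eigenvalues, n_features))
    for count, share in enumerate(shares, start=1):
        if share >= target_percent:
            return count
    return None
```

With these inputs the 16 listed components account for roughly 78.6% of the total variance, in agreement with the last cumulative entry of the table.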

Table 5 Data distribution

Data type	Training dataset	Test dataset
Normal	1 633 190	530 785
Flooding	48 484	8 097
Impersonation	48 522	20 079
Injection	65 379	16 682
Total	1 795 575	575 643

Table 6 Test datasets for experiments 1, 2 and 3

Dataset	Normal records	Attack records	Attack types
$H_1$	100	100	3
$H_2$	200	200	5
$H_3$	300	300	6
$H_4$	400	400	8
$H_5$	500	500	10
$H_6$	600	600	11
$H_7$	700	700	13
$H_8$	800	800	14
$H_9$	900	900	15
$H_{10}$	1 000	1 000	16

Table 7 Test datasets for experiment 4

Dataset	Normal records	Attack records	Attack types	Unknown attack types
$D_1$	100	100	2	1
$D_2$	200	200	3	2
$D_3$	300	300	3	3
$D_4$	400	400	4	4
$D_5$	500	500	5	5
$D_6$	600	600	5	6
$D_7$	700	700	6	7
$D_8$	800	800	6	8
$D_9$	900	900	6	9
$D_{10}$	1 000	1 000	6	10

Fig.7 Comparison of ACC in experiment 1

Fig.8 Comparison of FAR in experiment 1

Fig.9 Comparison of Recall in experiment 1

Fig.10 Comparison of F1 in experiment 1

Fig.11 Comparison of ACC in experiment 2

Fig.12 Comparison of FAR in experiment 2

Fig.13 Comparison of Recall in experiment 2

Fig.14 Comparison of F1 in experiment 2

Fig.15 Comparison of ACC in experiment 3

Fig.16 Comparison of FAR in experiment 3

Fig.17 Comparison of Recall in experiment 3

Fig.18 Comparison of F1 in experiment 3

Fig.19 Detection rate of unknown attack type in experiment 4
