Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (8): 1792-1799. DOI: 10.3778/j.issn.1673-9418.2102048

• Network and Information Security •

Android Malware Detection Method Based on Behavior Pattern

YANG Jiyun+, FAN Jiawen1, ZHOU Jie1, GAO Lingyun2

  1. School of Computer, Chongqing University, Chongqing 400044, China
  2. China National Petroleum Corporation Logging Company Limited Southwest Branch, Chongqing 400030, China
  • Received: 2021-02-22 Revised: 2021-06-16 Online: 2022-08-01 Published: 2021-06-23
  • Corresponding author: + E-mail: yangjy@cqu.edu.cn
  • About the authors: YANG Jiyun, born in 1975, male, from Chongqing, Ph.D., associate professor. His research interests include malware detection, privacy protection, etc.
    FAN Jiawen, born in 1998, male, from Guangyuan, Sichuan, M.S. candidate. His research interests include information security, privacy protection, etc.
    ZHOU Jie, born in 1995, female, from Dianjiang, Chongqing, M.S. candidate. Her research interests include Android malware detection, machine learning, etc.
    GAO Lingyun, born in 1984, male, from Luzhou, Sichuan, M.S. candidate, engineer. His research interests include drilling instrumentation, well-site informatization, etc.
  • Supported by:
    the Technological Innovation and Application Projects of Chongqing (cstc2019jscx-msxmX0077)

Abstract:

Most Android malware detection methods based on API (application programming interface) call sequences use N-grams and Markov chains to construct application behavior features. However, the feature sequences built this way are limited in length and include call sequences unrelated to malicious behavior, which lowers detection accuracy. This paper proposes an Android malware detection method based on behavior patterns. First, the longest sensitive API call sequence is extracted through call sequence reduction and call sequence merging. Then, a weighted support is defined, and on that basis an improved sequence pattern mining algorithm is proposed to mine highly discriminative sequence patterns from the different categories of samples as classification features. Finally, classifiers are built with different machine learning algorithms to detect malware. Experimental results show that the proposed method reaches a precision of 96.11% in Android malware detection, which is 4.60 and 2.11 percentage points higher, respectively, than two comparable detection methods based on API call data. The proposed method can therefore detect Android malware effectively.

Key words: malware detection, API call sequences, behavior pattern, sequence pattern mining

CLC Number:
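
The abstract's idea of mining sequence patterns that separate malicious from benign samples can be illustrated with a minimal sketch. The abstract does not give the paper's weighted support formula or its mining algorithm, so the code below is not the authors' method: it assumes plain per-class support (the fraction of samples containing a pattern as an ordered subsequence) and uses the gap between the two class supports as a stand-in discrimination score. The function names and the toy call sequences are hypothetical.

```python
from typing import Sequence


def is_subsequence(pattern: Sequence[str], calls: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `calls` in order (gaps allowed)."""
    it = iter(calls)
    # `api in it` advances the iterator, so matches must appear in order.
    return all(api in it for api in pattern)


def support(pattern: Sequence[str], samples: Sequence[Sequence[str]]) -> float:
    """Fraction of samples whose call sequence contains `pattern`."""
    if not samples:
        return 0.0
    return sum(is_subsequence(pattern, s) for s in samples) / len(samples)


def discrimination(pattern, malicious, benign) -> float:
    """Assumed stand-in score: difference of per-class supports.

    This is NOT the weighted support defined in the paper.
    """
    return support(pattern, malicious) - support(pattern, benign)


# Toy, made-up call sequences for illustration only.
malicious = [
    ["getDeviceId", "openConnection", "sendTextMessage"],
    ["getDeviceId", "getLine1Number", "sendTextMessage"],
]
benign = [
    ["openConnection", "getDeviceId"],
    ["getLastKnownLocation", "openConnection"],
]

pattern = ["getDeviceId", "sendTextMessage"]
score = discrimination(pattern, malicious, benign)
print(score)
```

In this toy data the pattern appears in every malicious sequence and in no benign one, so it would rank as highly discriminative under the assumed score. A pattern common to both classes would score near zero and would be a poor classification feature.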