Research on Software Defect Prediction Models Combining Static Analysis Warnings

doi:10.3778/j.issn.1673-9418.2402003

Abstract

Static analysis warnings, an important software quality indicator, are widely used to identify potential violations in source code. Recent studies have applied static analysis warnings to code smell detection and just-in-time defect prediction, but they have not addressed the early stage of a project, when commit history is lacking. To address this gap, this paper takes warning information from three popular static analysis tools and incorporates static analysis warnings as a new metric into an existing defect prediction model, building a defect prediction model that covers both software development and code maintainability. It explores the potential relationship between static analysis warnings and defects, the impact of incorporating warnings on the performance of software defect prediction models, and the impact of warnings in cross-project scenarios. The experimental results indicate that the number of warnings tends to be closely and positively correlated with the distribution of defects, which suggests that warnings have considerable potential as a metric in software defect prediction models. The warnings reported on defective data are often related to coding standards. After warnings are incorporated, the defect prediction model improves average precision by 1.4% to 14.7%, average recall by 0.2% to 2.4%, average F1 by 0.3% to 3.0%, and average AUC by 0.2% to 1.4% across projects. In cross-project scenarios, the CODE+SAW_VIF metric yields the best-performing defect prediction model, and incorporating static analysis warnings improves the model’s ability to identify defects.

Key words: software defects, static analysis tools, static analysis warnings, code metrics, cross-project scenario prediction
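
The abstract reports gains in precision, recall, F1, and AUC. As a reading aid, the following is a minimal sketch of how these four measures are commonly computed for a binary defect classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration, and the paper's averaging across projects is not reproduced here.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the defective class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defective, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defective, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (defective, clean) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one defective and one clean module")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = defective module, 0 = clean module.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]  # invented model outputs
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(precision_recall_f1(toy_labels, toy_preds))
print(auc(toy_labels, toy_scores))
```

Comparing a model trained on code metrics alone with one trained on code metrics plus warning counts would amount to calling these functions on each model's predictions for the same test set.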

WU Haitao, MA Jingyue, GAO Jianhua. Research on Software Defect Prediction Models Combining Static Analysis Warnings[J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(3): 818-834.
