Impact of Hyperparameter Optimization on Cross-Version Defect Prediction: An Empirical Study

doi:10.3778/j.issn.1673-9418.2209087

Abstract

In machine learning, hyperparameters are one of the key factors affecting model performance. Previous studies have shown that hyperparameter optimization can significantly improve the performance of within-version and cross-project defect prediction, but its impact on cross-version defect prediction remains unclear. This paper selects five classical defect prediction models (decision tree, K-nearest neighbors, random forest, support vector machine, and multi-layer perceptron) and four common hyperparameter optimization algorithms (TPE-based Bayesian optimization, SMAC-based Bayesian optimization, random search, and simulated annealing), and conducts experiments on the PROMISE dataset to investigate how hyperparameter optimization affects cross-version defect prediction performance. The results show that: first, the AUC of cross-version defect prediction improves significantly after hyperparameter optimization for the decision tree, K-nearest neighbors, and multi-layer perceptron models; second, the optimized models remain about as stable as the models with default hyperparameter settings; third, except for the relatively complex multi-layer perceptron, hyperparameter optimization takes 1 to 2 minutes on average per model, so optimizing model hyperparameters is feasible in cross-version defect prediction experiments. These results indicate that cross-version defect prediction should consider optimizing model hyperparameters to improve prediction performance.

Key words: software defect prediction, cross-version defect prediction, hyperparameter optimization
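
The experimental idea summarized in the abstract (train on one version, tune hyperparameters, evaluate AUC on the next version) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data as a stand-in for two PROMISE versions, covers only one model (decision tree) and one optimizer (random search), and the search space, the helper names `make_versions`, `cross_version_auc`, and `tuned_decision_tree`, and the budget of 30 iterations are all hypothetical choices made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier


def make_versions(seed=0):
    """Create two synthetic 'versions' of a project (stand-ins for PROMISE data)."""
    X, y = make_classification(
        n_samples=600, n_features=20, n_informative=6,
        weights=[0.8, 0.2], random_state=seed,
    )
    # The first 300 modules play the role of version k, the rest version k+1.
    return (X[:300], y[:300]), (X[300:], y[300:])


def cross_version_auc(model, train, test):
    """Fit on the earlier version and report AUC on the later version."""
    X_train, y_train = train
    X_test, y_test = test
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_test, scores)


def tuned_decision_tree(seed=0):
    """Random search over an illustrative (assumed) decision-tree search space."""
    space = {
        "max_depth": list(range(1, 21)),
        "min_samples_split": list(range(2, 21)),
        "min_samples_leaf": list(range(1, 21)),
        "criterion": ["gini", "entropy"],
    }
    # Tuning uses cross-validation inside the earlier version only,
    # so the later version stays unseen until the final evaluation.
    return RandomizedSearchCV(
        DecisionTreeClassifier(random_state=seed),
        param_distributions=space,
        n_iter=30, scoring="roc_auc", cv=5, random_state=seed,
    )


if __name__ == "__main__":
    older, newer = make_versions()
    default_auc = cross_version_auc(DecisionTreeClassifier(random_state=0), older, newer)
    tuned_auc = cross_version_auc(tuned_decision_tree(), older, newer)
    print(f"default AUC: {default_auc:.3f}")
    print(f"tuned AUC:   {tuned_auc:.3f}")
```

Because the two synthetic halves are drawn from the same distribution, the printed numbers say nothing about real cross-version behavior; the sketch only shows where tuning sits in the pipeline. Replacing the synthetic halves with two consecutive releases of a real project would bring it closer to the setting the paper studies.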

HAN Hui, YU Qiao, ZHU Yi. Impact of Hyperparameter Optimization on Cross-Version Defect Prediction: An Empirical Study[J]. Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2023, 17(12): 3052-3064.
