Journal of Frontiers of Computer Science and Technology ›› 2012, Vol. 6 ›› Issue (2): 109-117. DOI: 10.3778/j.issn.1673-9418.2012.02.002

• Academic Research •

Improving Software Defect Prediction by Combining the Information of Class and Package

PAN Sen, TAN Xi, PENG Xin, ZHAO Wenyun

  1. School of Computer Science, Fudan University, Shanghai 200433, China
  • Online: 2012-02-01 Published: 2012-02-01

Abstract: Defect prediction models built on software product metrics usually collect those metrics at two levels: class/file and package/component. Class-level models generally predict more efficiently, whereas package-level models tend to achieve better recall and precision. This paper proposes a method that combines class-level and package-level metrics: starting from the class-level prediction, it adjusts the result with information from the package-level prediction, so that the problem-domain information implicit in the package-level model, which may influence where defects occur, is taken into account. An experiment on the Eclipse 3.0 open-source code base shows that the method improves on the class-level defect prediction model. Recall rises by 5% to 8%, and to find 50% of the defects, 3.6% to 9.84% fewer lines of code need to be inspected.

Keywords: defect prediction, software testing, software metrics
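
The abstract does not say how the package-level prediction adjusts the class-level result, so the following is only a minimal sketch under stated assumptions, not the authors' method. The weighted blend in `combined_score`, the `weight` parameter, the 0.5 decision threshold, and the toy data are all hypothetical. The sketch illustrates the two quantities the abstract reports: recall, and the share of code that must be inspected to find 50% of the defects.

```python
from dataclasses import dataclass


@dataclass
class ClassRecord:
    name: str
    package: str
    loc: int            # lines of code in the class
    class_score: float  # class-level predicted defect probability
    defects: int        # actual defects (ground truth)


def combined_score(rec, package_scores, weight=0.5):
    """Hypothetical adjustment: blend the class score with its package's score."""
    pkg = package_scores.get(rec.package, rec.class_score)
    return (1.0 - weight) * rec.class_score + weight * pkg


def recall(records, scores, threshold=0.5):
    """Share of truly defective classes whose score reaches the threshold."""
    defective = [i for i, r in enumerate(records) if r.defects > 0]
    if not defective:
        return 0.0
    hits = sum(1 for i in defective if scores[i] >= threshold)
    return hits / len(defective)


def loc_fraction_to_find(records, scores, target=0.5):
    """Fraction of total LOC inspected, highest score first, until the
    target share of all defects has been covered."""
    total_loc = sum(r.loc for r in records)
    total_defects = sum(r.defects for r in records)
    if total_loc == 0 or total_defects == 0:
        return 0.0
    order = sorted(range(len(records)), key=lambda i: scores[i], reverse=True)
    seen_loc = 0
    seen_defects = 0
    for i in order:
        seen_loc += records[i].loc
        seen_defects += records[i].defects
        if seen_defects >= target * total_defects:
            break
    return seen_loc / total_loc


# Toy data (invented for illustration only).
records = [
    ClassRecord("A", "ui", 100, 0.6, 0),
    ClassRecord("B", "core", 100, 0.5, 2),
    ClassRecord("C", "core", 200, 0.4, 2),
    ClassRecord("D", "ui", 100, 0.3, 0),
]
package_scores = {"core": 0.9, "ui": 0.1}  # package-level predictions (assumed)

class_only = [r.class_score for r in records]
combined = [combined_score(r, package_scores) for r in records]

print("recall (class only):", recall(records, class_only))
print("recall (combined):  ", recall(records, combined))
print("LOC share for 50% of defects (class only):",
      loc_fraction_to_find(records, class_only))
print("LOC share for 50% of defects (combined):  ",
      loc_fraction_to_find(records, combined))
```

On this toy data the package score lifts the two defective classes in the `core` package above the clean `ui` classes, so both recall and inspection effort improve. Real gains depend on the data and on the actual adjustment rule, which the abstract leaves unspecified.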