Journal of Frontiers of Computer Science and Technology ›› 2012, Vol. 6 ›› Issue (1): 1-31. DOI: 10.3778/j.issn.1673-9418.2012.01.001

• Survey and Exploration •

Software Engineering Data Mining: A Survey

YU Shusi, ZHOU Shuigeng, GUAN Jihong   

  1. School of Computer Science, Fudan University, Shanghai 200433, China
  2. Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China
  3. Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
  • Online: 2012-01-01  Published: 2012-01-01

Abstract: As software systems grow in scale, manually obtaining the information needed to develop and maintain them is becoming increasingly difficult. Data mining techniques can automatically discover this information from software engineering data, thereby speeding up software development. This paper surveys the state of the art in software engineering data mining. First, it introduces the basic concepts and technical challenges of the field. Then, for each phase of software engineering, it reviews in detail the information and knowledge that data mining can uncover, together with the motivation, difficulties, procedures, and approaches involved in obtaining them, with particular emphasis on methods for data pre-processing and data representation. Finally, it discusses future trends in software engineering data mining research.

Key words: software engineering, data mining, data representation, data pre-processing, machine learning