Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (9): 2011-2029.DOI: 10.3778/j.issn.1673-9418.2110073

• Surveys and Frontiers •

Overview of Deep Learning-Based Code Representation and Its Applications

ZHANG Xiangping1,2, LIU Jianxun1,2,+

  1. Hunan Key Lab for Services Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
    2. School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
  • Received: 2021-10-28 Revised: 2022-04-21 Online: 2022-09-01 Published: 2022-09-15
  • About author:ZHANG Xiangping, born in 1993, Ph.D. candidate. His research interests include code representation and code clone detection.
    LIU Jianxun, born in 1970, Ph.D., professor. His research interests include service computing and cloud computing.
  • Supported by:
    National Natural Science Foundation of China (61872139)


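  • Corresponding author: + E-mail:
  • About author: ZHANG Xiangping (born 1993), male, from Sanming, Fujian, Ph.D. candidate. His research interests include code representation and code clone detection.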


Program analysis and inference play an important role in software development, maintenance, and migration, and how to efficiently extract high-quality information from program code has become an active research topic. In recent years, many researchers have applied deep learning-based representation techniques to code analysis tasks. Deep learning models can automatically extract useful features that are implicit in source code, which reduces the dependence on manually constructed features. This paper first introduces the background and basic concepts of code representation, and then summarizes recent work on deep learning-based code representation learning from the perspective of static code information analysis. It then describes the application of code representation to three tasks: code clone detection, code search, and code completion. Finally, it discusses the challenges facing deep learning-based code representation and possible research directions in this field.
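As a rough illustration of the pipeline the abstract describes (static information from code, then a vector representation, then a downstream task such as clone detection), the following is a minimal sketch and not a method from the paper. It assumes Python snippets as input and uses hand-counted AST node-type frequencies as a stand-in for a learned embedding. The function names `node_type_vector` and `cosine_similarity` and the three sample snippets are hypothetical.

```python
import ast
import math
from collections import Counter


def node_type_vector(source: str) -> Counter:
    """Count AST node types in a Python snippet.

    Assumption: this hand-built count vector stands in for the learned
    representation a deep model would produce; it ignores identifier names.
    """
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical snippets: the first two differ only in identifier names.
SUM_A = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
SUM_B = "def add_all(items):\n    acc = 0\n    for it in items:\n        acc += it\n    return acc\n"
OTHER = "class Point:\n    def __init__(self, x, y):\n        self.x = x\n        self.y = y\n"

vec_a, vec_b, vec_c = map(node_type_vector, (SUM_A, SUM_B, OTHER))
clone_score = cosine_similarity(vec_a, vec_b)  # renamed-identifier pair
other_score = cosine_similarity(vec_a, vec_c)  # structurally different pair
```

Because the two summation functions share the same syntactic structure, their vectors coincide and the pair scores higher than the unrelated class definition. A learned representation would replace the counting step with a trained encoder, while the comparison step could stay the same.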

Key words: code representation, representation learning, software engineering, code analysis, deep learning




CLC Number: