Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (2): 261-279.DOI: 10.3778/j.issn.1673-9418.2107040

• Surveys and Frontiers • Previous Articles     Next Articles

Review of Reinforcement Learning for Combinatorial Optimization Problem

WANG Yang, CHEN Zhibin+(), WU Zhaorui, GAO Yuan   

  1. Faculty of Science, Kunming University of Science and Technology, Kunming 650000, China
  • Received:2021-07-12 Revised:2021-09-16 Online:2022-02-01 Published:2021-09-24
  • About author:WANG Yang, born in 1997, M.S. candidate. His research interests include combinatorial optimization, reinforcement learning, deep reinforcement learning, etc.
    CHEN Zhibin, born in 1979, Ph.D., associate professor, M.S. supervisor. His research interests include combinatorial optimization, graph theory, approximate algorithms and operations research.
    WU Zhaorui, born in 1998, M.S. candidate. Her research interest is combinatorial optimization.
    GAO Yuan, born in 1997, M.S. candidate. Her research interest is combinatorial optimization.
  • Supported by:
    National Natural Science Foundation of China(11761042)


王扬, 陈智斌+(), 吴兆蕊, 高远   

  1. 昆明理工大学 理学院,昆明 650000
  • 通讯作者: + E-mail:
  • 作者简介:王扬(1997—),男,山东烟台人,硕士研究生,主要研究方向为组合最优化、强化学习、深度强化学习等。
  • 基金资助:


The solution methods for combinatorial optimization problem (COP) have permeated to the fields of artificial intelligence, operations research, etc. With the scale of data increasing and the speed of problem updating being faster, the traditional method of solving the COP is challenged in computational speed, precision and generali-zation ability. In recent years, reinforcement learning (RL) has been widely used in driverless, industrial automation and other fields, showing strong decision-making and learning ability. Thus, many researchers have strived to use RL to solve COP, which provides a novel method for solving these problems. This paper firstly introduces the common COP problems and the basic principles of RL. Then, this paper elaborates the difficulties of RL in solving COP, analyzes the advantages of RL in combinatorial optimization (CO) field, and studies the principle of the combina-tion of RL and COP. Subsequently, this paper summarizes the theoretical methods and applied researches of solving COP problems utilizing RL in recent years. In order to highlight the superiority of RL model, this paper also com-pares and analyzes the key points, algorithmic logic and optimization effect of various representative researches in solving COP problem, and sums up the limitations of different methods and their application fields. Finally, this paper proposes four potential research directions.

Key words: reinforcement learning (RL), deep reinforcement learning (DRL), combinatorial optimization problem (COP)



关键词: 强化学习(RL), 深度强化学习(DRL), 组合最优化问题(COP)

CLC Number: