Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (9): 2075-2091. DOI: 10.3778/j.issn.1673-9418.2301067

• Frontiers · Survey •

Survey on Visual-Guided Adversarial Imitation Learning

CUI Ming (崔铭), GONG Shengrong (龚声蓉)

  1. School of Electronic & Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215004, China
  • Online: 2023-09-01 Published: 2023-09-01

Abstract: The optimal decision-making problem has a long history in machine learning. Imitation learning, which grew out of reinforcement learning, studies how to reconstruct the desired policy from expert data and thereby learn optimal decisions. In recent years, imitation learning has been combined with computer vision in theoretical research and has achieved good results in applications such as autonomous driving and robotics. This survey first introduces the origin of imitation learning and its two traditional approaches, behavior cloning and inverse reinforcement learning. With the development of adversarial training architectures, generative adversarial imitation learning has become a key research direction, and the subsequent work that improves on it is collectively referred to as adversarial imitation learning. Second, the survey analyzes research on adversarial imitation learning combined with visual demonstrations, and summarizes common problems such as suboptimal expert demonstrations, scarce samples, and low sample efficiency, together with the existing solutions to each. Third, it compares, based on experimental results, how well different methods perform on the problems they address. Finally, it describes applications of adversarial visual imitation learning in real-world scenarios such as autonomous driving and industrial robots, and concludes by pointing out future theoretical research directions as well as application prospects and challenges.

Key words: imitation learning, behavior cloning, inverse reinforcement learning, adversarial imitation learning