Journal of Frontiers of Computer Science and Technology, 2023, Vol. 17, Issue (4): 792-809. DOI: 10.3778/j.issn.1673-9418.2212070

• Frontiers·Surveys •

Research Progresses of Multi-modal Intelligent Robotic Manipulation

ZHANG Qiuju, LYU Qing   

  1. School of Mechanical Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  2. Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Wuxi, Jiangsu 214122, China
  • Online: 2023-04-01 Published: 2023-04-01

机器人多模态智能操作技术研究综述

张秋菊,吕青   

  1. 江南大学 机械工程学院,江苏 无锡 214122
  2. 江苏省食品先进制造装备技术重点实验室,江苏 无锡 214122

Abstract: The trend toward flexible production in manufacturing and the diversification of applications in the service industry have fundamentally changed the demands placed on robots. Uncertainty in tasks and environments imposes higher requirements on the intelligence of robotic manipulation. Using multi-modal information to guide robotic manipulation can effectively improve the intelligence and flexibility of robots. Focusing on the two key issues of manipulation cognition and manipulation control, this paper analyzes in depth, from the perspective of multi-modal information fusion, the role that multi-modal information plays in enhancing the intelligence of robotic manipulation. First, the concepts of intelligent robotic manipulation and multi-modal information are clarified, and the merits of applying multi-modal information are introduced. Then, the commonly used perception models and control methods are analyzed in depth, and existing work is reviewed systematically. According to the level of the perception goal, robotic manipulation perception is divided into object perception, constraint perception and state perception; according to the control method, three commonly used approaches are introduced: control fusion based on analytical models, imitation learning control and reinforcement learning control. Finally, the current technical challenges and potential development trends are discussed.

Key words: multi-modal information, robot manipulation, robot cognition, robot control

摘要: 制造业柔性化的生产趋势、服务业应用场景的多元化扩展,促使机器人应用需求发生根本性变化,任务、环境的不确定性对机器人操作的智能程度提出了更高的要求。使用多模信息引导机器人操作,能够有效提高机器人自主性与易用性。围绕操作认知与操作控制两个关键问题,从多模态信息融合的角度,深入分析该手段对机器人操作智能化提升所起的作用。首先,明确了机器人智能操作与多模信息的具体概念,并阐述使用多模信息的优势;接着,深入分析了常用的认知计算模型与操作控制方法,并对现有工作展开系统性的梳理与介绍,依据认知目标层级的不同将机器人操作认知划分为对象认知、约束认知与状态认知三种类型,依据控制方法的不同介绍基于分析模型的控制融合、基于示教学习的控制和基于策略模型的控制三种常用的机器人操作控制模型;最后,分析了目前所面临的技术挑战并对其未来发展趋势做出了展望。

关键词: 多模信息, 机器人操作, 机器人认知, 机器人控制