Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (4): 792-809. DOI: 10.3778/j.issn.1673-9418.2212070
ZHANG Qiuju, LYU Qing
Online:
2023-04-01
Published:
2023-04-01
Abstract: The trend toward flexible production in manufacturing and the diversification of application scenarios in the service industry have fundamentally changed the demands placed on robots: the uncertainty of tasks and environments imposes higher requirements on the intelligence of robotic manipulation. Guiding robotic manipulation with multi-modal information can effectively improve robot autonomy and ease of use. Centering on two key problems, manipulation cognition and manipulation control, this survey analyzes, from the perspective of multi-modal information fusion, how this approach raises the intelligence of robotic manipulation. First, the concepts of intelligent robotic manipulation and multi-modal information are defined, and the advantages of using multi-modal information are described. Next, commonly used cognitive computing models and manipulation control methods are analyzed in depth, and existing work is systematically reviewed: according to the level of the cognitive target, robotic manipulation cognition is divided into three types, object cognition, constraint cognition, and state cognition; according to the control method, three common manipulation control models are introduced, namely control fusion based on analytical models, control based on learning from demonstration, and control based on policy models. Finally, current technical challenges are analyzed and future development trends are discussed.
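The abstract does not prescribe a specific fusion algorithm, so as an illustrative sketch only (not taken from the paper), the following shows one minimal form of multi-modal information fusion: combining two noisy estimates of the same quantity from different modalities (say, a vision-based and a touch-based object position) by inverse-variance weighting, so the more confident modality dominates. All numbers are hypothetical.

```python
def fuse_estimates(mu_a, var_a, mu_b, var_b):
    """Inverse-variance fusion of two independent Gaussian estimates.

    Each modality's estimate is weighted by its confidence (1/variance);
    the fused variance is never larger than either input variance, which
    is the basic payoff of combining complementary sensing modalities.
    """
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    mu = (w_a * mu_a + w_b * mu_b) / (w_a + w_b)
    var = 1.0 / (w_a + w_b)
    return mu, var

# Hypothetical 1D object position: vision is noisy, touch is precise.
mu, var = fuse_estimates(mu_a=0.10, var_a=0.04, mu_b=0.02, var_b=0.01)
print(round(mu, 4), round(var, 4))  # fused estimate is pulled toward touch
```

This is the same update that underlies Kalman-filter-style fusion discussed in the multi-sensor literature the survey covers, reduced to a single scalar measurement step.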
ZHANG Qiuju, LYU Qing. Research Progresses of Multi-modal Intelligent Robotic Manipulation[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(4): 792-809.