Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (3): 549-560. DOI: 10.3778/j.issn.1673-9418.2209014

• Frontiers·Surveys •

Overview of Visual Inertial Odometry Technology Based on Deep Learning

WANG Wensen, HUANG Fengrong, WANG Xu, LIU Qinglin, YI Boheng

  1. College of Mechanical Engineering, Hebei University of Technology, Tianjin 300401, China
  2. 93756 Unit of PLA, China
  • Online: 2023-03-01 Published: 2023-03-01

Abstract: Visual inertial odometry (VIO) exploits the complementary strengths of visual and inertial sensors to deliver high-precision six-degree-of-freedom (6-DOF) navigation and positioning, and is therefore used in a wide range of applications. However, the sensors' own errors, disturbances from abnormal visual environments, and spatiotemporal calibration errors between sensors all corrupt the navigation results and reduce accuracy. In recent years, rapidly developing deep learning methods, with their strong data processing and prediction capabilities, have opened a new direction for VIO. This paper reviews and summarizes the main advances in deep learning-based VIO. First, the research methods are outlined according to two fusion strategies: methods that combine deep learning with traditional models, and end-to-end methods based on deep learning. Then, the methods are divided by learning type into supervised and unsupervised/self-supervised approaches, and their model structures are described. Next, system optimization and evaluation methods are summarized, and the performance of several representative methods is compared. Finally, the key open problems in this field are summarized and future developments are discussed.

Key words: visual inertial odometry, feature fusion, deep learning, network model, pose