Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (4): 916-929. DOI: 10.3778/j.issn.1673-9418.2309010

• Frontiers · Surveys •

Survey of 3D Model Recognition Based on Deep Learning

ZHOU Yan, LI Wenjun, DANG Zhaolong, ZENG Fanzhi, YE Dewang   

  1. School of Electronic Information Engineering, Foshan University, Foshan, Guangdong 528000, China
  2. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, China
  • Online: 2024-04-01 Published: 2024-04-01

Abstract: With the rapid development of 3D visual perception devices such as 3D scanners and LiDAR, 3D model recognition is attracting growing research attention. The field has two core tasks: 3D model classification and retrieval. Deep learning has already achieved significant success in 2D vision tasks, and bringing it into 3D vision has both overcome the limitations of traditional methods and enabled notable progress in areas such as autonomous driving and intelligent robotics. However, applying deep learning to 3D model recognition still faces several challenges. This paper therefore reviews the application of deep learning to 3D model recognition. It first discusses commonly used evaluation metrics and public datasets, giving relevant information and the source of each dataset. It then describes representative methods in detail from several perspectives, including point clouds, views, voxels, and multimodal fusion, and summarizes recent related research. The performance of these methods is compared on the datasets, and the strengths and limitations of each method are analyzed. Finally, based on the advantages and disadvantages of each class of methods, the paper summarizes the open challenges in 3D model recognition and outlines future trends in the field.

Key words: three-dimensional vision, deep learning, point clouds, views, voxels, multimodal