3D Model Recognition Based on Deep Graph Attention CNN

doi:10.3778/j.issn.1673-9418.2003005

Abstract

Existing deep-learning methods for 3D model recognition lack contextual fine-grained local features, so they can confuse classes whose geometric shapes are very similar and whose local details differ only slightly. To address this, a 3D model recognition method based on a deep graph attention convolutional neural network is proposed. First, a neighborhood selection mechanism is introduced to mine the fine-grained local features of 3D models. Second, a spatial context coding mechanism captures multi-scale spatial context information, which complements the fine-grained local features and makes the features more complete. Finally, a multi-head mechanism lets the graph attention convolution layer aggregate the features of multiple single heads, which enriches the features. In addition, a selective dropout algorithm is designed to prevent network overfitting: it ranks neurons by importance according to the value of a measurement weight and discards those of lower importance. The proposed algorithm reaches a recognition accuracy of 92.6% on the ModelNet40 dataset with low network complexity, and its trade-off between recognition accuracy and network complexity is better than that of contemporary mainstream methods.

Key words: machine vision, 3D model recognition, graph attention convolution layer, convolutional neural network (CNN), selective dropout
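The selective dropout idea in the abstract (rank neurons by an importance measure, then discard the least important ones) can be illustrated with a minimal sketch. The abstract does not define the measurement weight or the discard rule, so everything specific here is an assumption: `importance` is a hypothetical per-neuron score supplied by the caller, `drop_fraction` is a hypothetical parameter, the lowest-ranked neurons are dropped deterministically, and the optional rescaling of survivors is borrowed from ordinary inverted dropout rather than taken from the paper.

```python
def selective_dropout(activations, importance, drop_fraction=0.25, rescale=True):
    """Zero the least important units of one layer (illustrative sketch only).

    activations:   list of floats, one per neuron.
    importance:    list of floats, one hypothetical importance score per neuron
                   (stands in for the paper's measurement weight, which the
                   abstract does not define).
    drop_fraction: assumed fraction of neurons to discard, in [0, 1).
    rescale:       if True, scale survivors by n / (n - n_drop), as in
                   ordinary inverted dropout (an assumption, not from the paper).
    """
    if len(activations) != len(importance):
        raise ValueError("activations and importance must have the same length")
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")

    n = len(activations)
    if n == 0:
        return []

    n_drop = int(n * drop_fraction)
    # Rank neuron indices from least to most important (ties keep index order).
    ranked = sorted(range(n), key=lambda i: importance[i])
    dropped = set(ranked[:n_drop])

    scale = n / (n - n_drop) if rescale else 1.0
    return [0.0 if i in dropped else a * scale for i, a in enumerate(activations)]


if __name__ == "__main__":
    acts = [1.0, 2.0, 3.0, 4.0]
    scores = [0.9, 0.1, 0.5, 0.7]
    # The two lowest-scored neurons (indices 1 and 2) are zeroed.
    print(selective_dropout(acts, scores, drop_fraction=0.5, rescale=False))
```

Unlike standard dropout, which picks units at random, this sketch always removes the lowest-ranked units; whether the paper's algorithm is deterministic or biases a random draw by importance is not stated in the abstract.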

DANG Jisheng, YANG Jun. 3D Model Recognition Based on Deep Graph Attention CNN[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(1): 141-149.
