Face Recognition Method Based on Attention Mechanism and Curriculum Learning

doi:10.3778/j.issn.1673-9418.2209111

Abstract

Abstract: To address the weak discriminability of extracted facial features and the insufficient distinction between easy and hard samples in current face recognition algorithms, this paper proposes a face recognition algorithm that combines an attention mechanism with curriculum learning, called efficient cooperative attention and curriculum face (ECACFace). The algorithm introduces an efficient spatial channel attention module (ESCA) and integrates it into the basic block of the feature extraction network. ESCA uses the efficient channel attention module (ECA) to obtain channel attention and adds a spatial attention module after ECA, so that spatial attention is obtained on top of the channel information and the resulting face feature vector carries richer information for face classification. In addition, a loss function based on curriculum learning is introduced during training to distinguish easy from hard samples: training emphasizes easy samples in the early stage and hard samples in the later stage, which yields discriminative sample learning. ECACFace based on lightweight and shallow networks, trained on the CASIA-WebFace dataset, improves accuracy by more than 1.5 percentage points over the original networks. ECACFace based on a deep network, trained on the million-scale MS1MV2 dataset, improves accuracy on the CPLFW dataset by 1.14 percentage points over ArcFace. The experimental results show that combining the ESCA module with the curriculum-learning-based loss function further improves face recognition performance.

Key words: face recognition, feature extraction, curriculum learning, attention mechanism
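
The abstract does not give the formula of the curriculum-learning loss, so the following is only a minimal sketch of the general idea, assuming a CurricularFace-style modulation of negative-class cosine logits on top of an additive angular margin. The function names and the values of `margin`, `scale`, and `momentum` are illustrative assumptions, not taken from the paper.

```python
import math


def modulate_negative(cos_neg, cos_pos_margin, t):
    """Re-weight one negative-class cosine (assumed CurricularFace-style rule).

    Easy case: the margin-adjusted target cosine still beats the negative,
    so the negative cosine is left unchanged.
    Hard case: the negative cosine is scaled by (t + cos_neg). For a positive
    cosine, a small t shrinks it and a larger t enlarges it.
    """
    if cos_pos_margin >= cos_neg:
        return cos_neg
    return cos_neg * (t + cos_neg)


def update_t(t, batch_pos_cosines, momentum=0.99):
    """Track training progress as a moving average of target-class cosines.

    t starts near 0 and grows as the model fits the data, which shifts the
    emphasis from easy samples to hard samples. momentum is an assumed value.
    """
    mean_pos = sum(batch_pos_cosines) / len(batch_pos_cosines)
    return momentum * t + (1.0 - momentum) * mean_pos


def curriculum_loss(cosines, label, t, margin=0.5, scale=64.0):
    """Cross-entropy over margin-adjusted, curriculum-modulated logits.

    cosines: cosine similarity between one embedding and every class centre.
    Simplification: assumes theta + margin stays below pi.
    """
    theta = math.acos(max(-1.0, min(1.0, cosines[label])))
    cos_pos_margin = math.cos(theta + margin)
    logits = []
    for j, c in enumerate(cosines):
        if j == label:
            logits.append(scale * cos_pos_margin)
        else:
            logits.append(scale * modulate_negative(c, cos_pos_margin, t))
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]
```

With the same cosines, a sample that has a hard negative contributes a smaller loss when `t` is close to 0 (early training) than when `t` is close to 1 (late training), which matches the easy-first, hard-later schedule described in the abstract.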

WANG Haiyong, PAN Haitao, LIU Guinan. Face Recognition Method Based on Attention Mechanism and Curriculum Learning[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(8): 1893-1903.
