Deep Convolutional Neural Network Algorithm Fusing Global and Local Features

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (5): 1146-1154. DOI: 10.3778/j.issn.1673-9418.2104106

CHENG Weiyue¹,⁺, ZHANG Xueqin², LIN Kezheng², LI Ao²

1. Heilongjiang College of Business and Technology, Harbin 150025, China
2. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China

Received: 2021-04-29 Revised: 2021-08-05 Online: 2022-05-01 Published: 2022-05-19
⁺ Corresponding author. E-mail: cheng_weiyue@sina.cn
About the authors:
CHENG Weiyue, born in 1988 in Harbin, Heilongjiang, M.S., lecturer. Her research interests include sparse representation, image processing, and pattern recognition.
ZHANG Xueqin, born in 1994 in Pingding, Shanxi, M.S. candidate. Her research interests include image processing and pattern recognition.
LIN Kezheng, born in 1962 in Penglai, Shandong, Ph.D., professor, M.S. supervisor. His research interests include image processing, machine vision, and pattern recognition.
LI Ao, born in 1986 in Harbin, Heilongjiang, Ph.D., associate professor, Ph.D. supervisor. His research interests include sparse representation, image restoration, and computer vision.
Supported by:
National Natural Science Foundation of China (62071157); Natural Science Foundation of Heilongjiang Province (F2015040); University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (UNPYSCT-2018203)

Abstract

To further improve the accuracy of facial expression recognition, a deep convolutional neural network algorithm that fuses global and local features (GL-DCNN) is proposed. The algorithm consists of two improved convolutional neural network branches, a global branch and a local branch, which extract global and local features respectively. The features from the two branches are fused by weighting, and the fused features are used for classification. First, the global branch, which is based on transfer learning, extracts global features with an improved VGG19 network model. Second, the local branch applies the center-symmetric local binary pattern (CSLBP) algorithm as a first feature-extraction step to obtain the local texture information of the original image; this texture information is then fed into a shallow convolutional neural network for a second feature-extraction step, so that local features related to facial expressions are extracted automatically. Third, two cascaded fully connected layers reduce the dimensionality of the features from the two branches, and the branches are assigned different weights for weighted fusion. Finally, a softmax classifier performs the classification. Experiments on the CK+ and JAFFE datasets give classification accuracies above 95% and 93%, respectively. Compared with five other algorithms, the proposed algorithm performs well overall, with good recognition accuracy and robustness, and can provide an effective basis for facial expression recognition.

Key words: facial expression recognition, feature fusion, convolutional neural network (CNN), deep learning
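To make the first step of the local branch concrete, the following is a minimal sketch of the standard center-symmetric LBP operator on a radius-1, 8-neighbour ring. The page does not give the authors' exact CSLBP variant, so the neighbour ordering, the `threshold` parameter, and the function name `cslbp` are assumptions made for illustration only.

```python
def cslbp(image, threshold=0):
    """Return 4-bit CS-LBP codes (0..15) for the interior pixels of a
    2-D grayscale image given as a list of equal-length rows."""
    # The 8 neighbours at radius 1, ordered so that offsets[i] and
    # offsets[i + 4] lie opposite each other across the centre pixel.
    offsets = [(0, 1), (1, 1), (1, 0), (1, -1),
               (0, -1), (-1, -1), (-1, 0), (-1, 1)]
    rows, cols = len(image), len(image[0])
    codes = [[0] * (cols - 2) for _ in range(rows - 2)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for i in range(4):
                dr, dc = offsets[i]
                odr, odc = offsets[i + 4]
                # Compare each neighbour with its centre-symmetric partner;
                # the centre pixel itself is never used.
                diff = image[r + dr][c + dc] - image[r + odr][c + odc]
                if diff > threshold:
                    code |= 1 << i
            codes[r - 1][c - 1] = code
    return codes


# A horizontal intensity ramp: brighter to the right.
ramp = [[0, 1, 2],
        [0, 1, 2],
        [0, 1, 2]]
print(cslbp(ramp))
```

Because only four symmetric pairs are compared, each pixel yields one of 16 codes rather than the 256 of plain LBP, which keeps the texture map compact before it is passed to the shallow network.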

CLC number: TP391.4

Cite this article:

CHENG Weiyue, ZHANG Xueqin, LIN Kezheng, LI Ao. Deep Convolutional Neural Network Algorithm Fusing Global and Local Features[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(5): 1146-1154.

References

[1]	YANG B, CAO J, NI R, et al. Facial expression recognition using weighted mixture deep neural network based on double-channel facial images[J]. IEEE Access, 2018, 6: 4630-4640.
[2]	WU B F, LIN C H. Adaptive feature mapping for customizing deep learning based facial expression recognition model[J]. IEEE Access, 2018, 6: 12451-12461.
[3]	ZHANG T, ZHENG W M, CUI Z, et al. A deep neural network-driven feature learning method for multi-view facial expression recognition[J]. IEEE Transactions on Multimedia, 2018, 18(12): 2528-2536.
[4]	ZENG G, ZHOU J, JIA X, et al. Hand-crafted feature guided deep learning for facial expression recognition[C]// Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition, Xi'an, May 15-19, 2018. Washington: IEEE Computer Society, 2018: 423-430.
[5]	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// LNCS 11211: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 564-574.
[6]	SUN X, PAN T. Static facial expression recognition system using ROI deep neural networks[J]. Acta Electronica Sinica, 2017, 45(5): 1189-1197.
[7]	WANG X JIN, LIU W, et al. Feature fusion of HOG and WLD for facial expression recognition[C]// Proceedings of the 2013 IEEE/SICE International Symposium on System Integration, Kobe, Dec 15-17, 2013. Piscataway: IEEE, 2013: 227-232.
[8]	JIA Q, GAO X, GUO H, et al. Multi-layer sparse representation for weighted LBP-patches based facial expression recognition[J]. Sensors, 2015, 15(3): 6719.
[9]	WANG L, LI R F, WANG K, et al. Feature representation for facial expression recognition based on FACS and LBP[J]. International Journal of Automation & Computing, 2014, 11(5): 459-468.
[10]	ZHOU J, ZHANG S, MEI H, et al. A method of facial expression recognition based on Gabor and NMF[J]. Pattern Recognition and Image Analysis, 2016, 26(1): 119-124.
[11]	SILWAL R, ALSADOON A, PRASAD P W C, et al. A novel deep learning system for facial feature extraction by fusing CNN and MB-LBP and using enhanced loss function[J]. Multimedia Tools and Applications, 2020, 79(1): 1-21.
[12]	SHIN M, KIM M, KWON D S. Baseline CNN structure analysis for facial expression recognition[C]// Proceedings of the 25th IEEE International Symposium on Robot and Human Interactive Communication, New York, Aug 26-31, 2016. Piscataway: IEEE, 2016: 724-729.
[13]	CONNIE T, AL-SHABI M, CHEAH W P, et al. Facial expression recognition using a hybrid CNN-SIFT aggregator[C]// LNCS 10607: Proceedings of the 11th International Workshop on Multi-Disciplinary Trends in Artificial Intelligence, Gadong, Nov 20-22, 2017. Cham: Springer, 2017: 139-149.
[14]	ZOU G F, FU G X, GAO M L, et al. A new approach for small sample face recognition with pose variation by fusing Gabor encoding features and deep features[J]. Multimedia Tools and Applications, 2020, 79(31): 23571-23598.
[15]	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[J]. arXiv:1409.1556, 2014.
[16]	YANG H X, ZHANG F, CHEN Y, et al. Single-sample face recognition combining improved CSLBP and bit-plane decomposition[J]. Progress in Laser and Optoelectronics, 2018, 55(7): 221-228.
[17]	PATIL V, SARODE T. Modified CSLBP[J]. International Journal of Electrical and Computer Engineering, 2019, 9(4): 2950-2959.
[18]	HU Y P. A cross-age face recognition algorithm based on HOG-CSLBP and deep learning[J]. Journal of Southwest China Normal University (Natural Science Edition), 2020, 45(3): 115-120.
[19]	DONG S H, XU Y G. Fast pedestrian detection algorithm based on HOG-CSLBP[J]. Computer Engineering and Design, 2018, 39(4): 1125-1129.
[20]	ZHAI H Q, LIU D, LIU J. A facial expression recognition method based on dual-stream convolutional neural network[J]. Optical Technique, 2020, 46(6): 712-720.
[21]	LUCEY P, COHN J F, KANADE T J, et al. The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression[C]// Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, Jun 13-18, 2010. Washington: IEEE Computer Society, 2010: 94-101.
[22]	LIN K Z, BAI J X, LI H T, et al. Facial expression recognition with small samples fused with different models under deep learning[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(3): 482-492.
[23]	LOPES A T, DEAGUIAR E, DESOUZA A F, et al. Facial expression recognition with convolutional neural networks: coping with few data and the training sample order[J]. Pattern Recognition, 2017, 61(1): 610-628.
[24]	ZHANG Y Q, HE N, WEI R C. Face expression recognition based on convolutional neural networks fusing SIFT features[J]. Computer Applications and Software, 2019, 36(11): 161-167.

Layer	Input size	Parameters	Output size
Conv-1	64×64×3	7×7×64-1-3	64×64×64
MaxPool-1	64×64×64	2×2-2-0	32×32×64
Dropout1	32×32×64	—	32×32×64
Conv-2	32×32×64	3×3×256-1-0	32×32×256
MaxPool-2	32×32×256	2×2-2-0	16×16×256
Dropout2	16×16×256	—	16×16×256
FC	1×500	—	1×500
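The sizes in this table can be checked with the usual output-size formula, out = floor((in + 2p - k) / s) + 1. The sketch below assumes that the Parameters column reads as kernel×kernel×channels-stride-padding; that reading is an inference from the rows themselves and is not stated on the page. The helper names are hypothetical, and Dropout rows are skipped because they do not change the shape.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


# (layer, kernel, stride, padding, output size listed in the table)
LOCAL_BRANCH = [
    ("Conv-1", 7, 1, 3, 64),
    ("MaxPool-1", 2, 2, 0, 32),
    ("Conv-2", 3, 1, 0, 32),
    ("MaxPool-2", 2, 2, 0, 16),
]


def check_table(input_size, layers):
    """Return (layer, computed size, listed size) for each row."""
    rows, size = [], input_size
    for name, kernel, stride, padding, listed in layers:
        rows.append((name, conv_output_size(size, kernel, stride, padding), listed))
        size = listed  # carry the table's own value forward to the next row
    return rows


for name, computed, listed in check_table(64, LOCAL_BRANCH):
    flag = "" if computed == listed else "  <- differs from the table"
    print(f"{name}: computed {computed}, listed {listed}{flag}")
```

Under this reading every row agrees with the formula except Conv-2: a 3×3 kernel with stride 1 and padding 0 on a 32×32 input would give 30×30, whereas the table lists 32×32, which is what padding 1 would produce. The page does not say which of the two values is intended.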

Layer	Input size	Parameters	Output size
Conv-1	64×64×3	3×3×64-1-1	64×64×64
Conv-2	64×64×64	3×3×64-1-1	64×64×64
MaxPool-1	64×64×64	2×2-2-0	32×32×64
Dropout1	32×32×64	—	32×32×64
Conv-3	32×32×64	3×3×128-1-1	32×32×128
Conv-4	32×32×128	3×3×128-1-1	32×32×128
MaxPool-2	32×32×128	2×2-2-0	16×16×128
Dropout2	16×16×128	—	16×16×128
Conv-5	16×16×128	3×3×256-1-1	16×16×256
Conv-6	16×16×256	3×3×256-1-1	16×16×256
Conv-7	16×16×256	3×3×256-1-1	16×16×256
Conv-8	16×16×256	3×3×256-1-1	16×16×256
MaxPool-3	16×16×256	2×2-2-0	8×8×256
Dropout3	8×8×256	—	8×8×256
Conv-9	8×8×256	3×3×512-1-1	8×8×512
Conv-10	8×8×512	3×3×512-1-1	8×8×512
Conv-11	8×8×512	3×3×512-1-1	8×8×512
Conv-12	8×8×512	3×3×512-1-1	8×8×512
MaxPool-4	8×8×512	2×2-2-0	4×4×512
Dropout4	4×4×512	—	4×4×512
Conv-13	4×4×512	3×3×512-1-1	4×4×512
Conv-14	4×4×512	3×3×512-1-1	4×4×512
Conv-15	4×4×512	3×3×512-1-1	4×4×512
Conv-16	4×4×512	3×3×512-1-1	4×4×512
MaxPool-5	4×4×512	2×2-2-0	2×2×512
Dropout5	2×2×512	—	2×2×512
FC-1	2×2×512	—	500
Dropout6	1×500	—	1×500
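The layer pattern in this table (2, 2, 4, 4 and 4 convolutions, each group followed by 2×2 max pooling) matches the VGG19 convolutional stack, here applied to a 64×64 input. As a rough check, the sketch below traces the spatial size through the five blocks and counts the weights of the final 2×2×512 to 500 fully connected layer. It assumes 3×3 kernels with stride 1 and padding 1, 2×2 pooling with stride 2, and a dense layer with a bias term; the function name and block description are illustrative, not taken from the paper.

```python
def trace_blocks(input_size, blocks):
    """Return the (height, width, channels) shape after each conv block."""
    size, shapes = input_size, []
    for n_convs, channels in blocks:
        for _ in range(n_convs):
            # 3x3 convolution, stride 1, padding 1: spatial size is preserved.
            size = (size + 2 * 1 - 3) // 1 + 1
        # 2x2 max pooling, stride 2, no padding: spatial size is halved.
        size = (size - 2) // 2 + 1
        shapes.append((size, size, channels))
    return shapes


# (number of conv layers, output channels) per block, read off the table.
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]

shapes = trace_blocks(64, VGG19_BLOCKS)
height, width, channels = shapes[-1]
flattened = height * width * channels
fc_params = flattened * 500 + 500  # weights plus one bias per output unit

print(shapes)
print(flattened, fc_params)
```

With a 64×64 input the last pooling layer leaves a 2×2×512 map, so the flattened vector that feeds FC-1 has 2048 entries, far fewer than the 7×7×512 map the original VGG19 produces for 224×224 inputs.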

Algorithm	Recognition rate on CK+/%	Recognition rate on JAFFE/%
Hand-crafted feature extraction	88.14	87.32
CNN+LBP	92.08	88.33
Ref. [23]	93.68	88.73
VGG16	95.00	90.86
Ref. [24]	95.40	92.10
GL-DCNN	95.51	93.01
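The GL-DCNN row comes from classifying the fused feature vector. The abstract only says that the two branches receive different weights before fusion and that a softmax classifier follows, so the sketch below is one plausible reading and not the authors' implementation: it assumes both branches output vectors of equal length and combines them as alpha·global + (1 - alpha)·local. The weight `alpha`, the toy vectors, and the function names are hypothetical.

```python
import math


def weighted_fusion(global_feat, local_feat, alpha):
    """Convex combination of two equally sized feature vectors."""
    if len(global_feat) != len(local_feat):
        raise ValueError("both branches must output vectors of the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * g + (1.0 - alpha) * l
            for g, l in zip(global_feat, local_feat)]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 3-dimensional features standing in for the 500-dimensional branch outputs.
fused = weighted_fusion([2.0, 0.0, 1.0], [0.0, 1.0, 1.0], alpha=0.5)
probs = softmax(fused)
print(fused)
print(probs.index(max(probs)))
```

In a trained network a learned linear layer would map the fused vector to one score per expression class before the softmax; that layer is omitted here to keep the example short.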

Expression category	CK+	JAFFE
Angry	45	31
Disgust	59	29
Fear	25	30
Happy	69	32
Sad	28	30
Surprise	83	31

Processor	Training time/s	Recognition rate/%
CPU	130.00	91.69
GPU	2.00	92.71
