Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (8): 1764-1778. DOI: 10.3778/j.issn.1673-9418.2110049

• Surveys and Frontiers •

Summary of Expression Recognition Technology

HONG Huiqun1,2,3, SHEN Guiping1,2,+, HUANG Fenghua1,2,3

  1. College of Artificial Intelligence, Yango University, Fuzhou 350015, China
  2. Fujian University Engineering Research Center of Spatial Data Mining and Application, Yango University, Fuzhou 350015, China
  3. Fujian Key Laboratory of Spatial Information Perception and Intelligent Processing, Yango University, Fuzhou 350015, China
  • Received: 2021-10-20 Revised: 2022-04-11 Online: 2022-08-01 Published: 2022-08-19
  • Corresponding author: +E-mail: 992774639@qq.com
  • About author: HONG Huiqun, born in 1984 in Nan'an, Fujian, M.S., lecturer, engineer, member of CCF. Her research interests include image processing, computer vision, expression recognition, etc.
    SHEN Guiping, born in 1986, Ph.D., associate professor. Her research interests include pattern recognition and intelligent systems, expression recognition, image restoration, video expression, audio processing, etc.
    HUANG Fenghua, born in 1982 in Putian, Fujian, Ph.D., professor, M.S. supervisor, member of CCF. His research interests include machine learning, data mining, remote sensing, etc.
  • Supported by:
    the National Natural Science Foundation of China (41501451); the Natural Science Foundation of Fujian Province (2019J01088); the Natural Science Foundation of Fujian Province (2019J01087)

Abstract:

Facial expressions are an important basis for judging human emotion and for human-computer interaction. Advances in traditional machine learning and deep learning have brought both opportunities and challenges to facial expression recognition and analysis. This paper first analyzes the relationship and differences between expression recognition and emotion analysis, pointing out that expression recognition focuses on identifying facial expressions and emotions. It then summarizes expression recognition techniques based on single-modal datasets and traditional machine learning methods, together with their advantages and disadvantages, and introduces techniques based on single-modal datasets and deep learning methods. Expression recognition based on single-modal data has certain limitations: datasets are insufficient in both quantity and quality, recognition accuracy is generally low, and most work remains at the laboratory research stage. This motivates expression recognition based on multimodal datasets and methods for fusing modalities. The paper introduces commonly used multimodal expression datasets and analyzes multimodal expression recognition techniques and inter-modal fusion techniques, which fall into three types: feature-level fusion, decision-level fusion, and hybrid fusion. Finally, it summarizes expression recognition technology and discusses future directions: to address the dataset problem, more high-quality expression datasets captured in natural environments can be constructed, and multimodal datasets can be built by incorporating posture and physiological signals such as EEG; other directions include using GANs for data augmentation, attending to the extraction of micro-expressions, and studying multimodal fusion algorithms.

Key words: facial expression recognition, multimodal, modal fusion

CLC Number: