Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (4): 909-916.DOI: 10.3778/j.issn.1673-9418.2105071
• Artificial Intelligence •
BAO Guangbin, LI Gangle+, WANG Guoxiong
Received: 2021-05-19
Revised: 2021-08-03
Online: 2022-04-01
Published: 2021-08-05
About author: BAO Guangbin, born in 1975 in Lanzhou, Gansu, Ph.D., associate professor. His research interests include big data analysis and natural language processing.
Corresponding author: + E-mail: 1450316716@qq.com
BAO Guangbin, LI Gangle, WANG Guoxiong. Bimodal Interactive Attention for Multimodal Sentiment Analysis[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(4): 909-916.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2105071
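The bimodal interactive attention named in the title lets each pair of modalities attend to one another before fusion. The following NumPy sketch illustrates the general cross-modal attention mechanism only; the dot-product scoring, the feature dimensions, and the function names are illustrative assumptions, not the authors' exact Con-BIAM formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bimodal_interactive_attention(X, Y):
    """Cross-modal attention between two modality sequences.

    X: (n, d) utterance features from modality 1 (e.g. text)
    Y: (m, d) utterance features from modality 2 (e.g. audio)
    Returns each modality's representation re-weighted by the other.
    """
    M = X @ Y.T                    # (n, m) cross-modal affinity matrix
    A_xy = softmax(M, axis=1)      # attention of X over Y (rows sum to 1)
    A_yx = softmax(M.T, axis=1)    # attention of Y over X
    X_att = A_xy @ Y               # Y-informed view of X, shape (n, d)
    Y_att = A_yx @ X               # X-informed view of Y, shape (m, d)
    return X_att, Y_att

text = np.random.randn(5, 50)      # 5 utterances, 50-dim word vectors
audio = np.random.randn(5, 50)
t_att, a_att = bimodal_interactive_attention(text, audio)
```

In a full model, the attended and original features would be combined (e.g. concatenated) and passed to the classification layers.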
| Parameter | Value |
|---|---|
| Word embedding dimension | 50 |
| BiGRU hidden units | 300 |
| Fully connected layer neurons | 100 |
| Dropout | 0.5 |
| Learning rate | 0.001 |
| Batch size | 32 |
| Epochs | 30 |
| Optimizer | Adam |
| Loss function | Categorical cross-entropy |

Table 1 Experimental parameter settings
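The settings in Table 1 can be gathered into a single training configuration, e.g. as a plain Python dict (the key names here are illustrative, not taken from the authors' code):

```python
# Hyperparameters from Table 1, collected as one config object.
CONFIG = {
    "embedding_dim": 50,          # word embedding dimension
    "bigru_units": 300,           # BiGRU hidden units
    "dense_units": 100,           # fully connected layer neurons
    "dropout": 0.5,
    "learning_rate": 0.001,
    "batch_size": 32,
    "epochs": 30,
    "optimizer": "adam",
    "loss": "categorical_crossentropy",
}
```

Keeping all hyperparameters in one place makes the ablation runs reported later reproducible from a single point of change.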
| Model | Accuracy | F1 |
|---|---|---|
| GME-LSTM | 76.50 | 73.40 |
| MARN | 77.10 | 77.00 |
| TFN | 77.10 | 77.90 |
| Dialogue-RNN | 79.80 | 79.48 |
| BC-LSTM | 80.30 | — |
| Multilogue-Net | 81.19 | 80.10 |
| Con-BIAM | 81.91 | 85.40 |

Table 2 Experimental results on MOSI dataset (%)
| Modalities | Dialogue-RNN | Multilogue-Net | Con-BIAM |
|---|---|---|---|
| T+A | 79.80 | 80.18 | 80.45 |
| V+T | 78.90 | 80.06 | 80.98 |
| A+V | 73.90 | 75.16 | 63.96 |
| A+V+T | 79.80 | 81.19 | 81.91 |

Table 3 Accuracy of different models in bimodal and trimodal feature fusion (%)
| Modalities | Dialogue-RNN | Multilogue-Net | Con-BIAM |
|---|---|---|---|
| T+A | 78.32 | 79.88 | 84.14 |
| V+T | 78.12 | 79.84 | 84.43 |
| A+V | 73.92 | 74.04 | 75.20 |
| A+V+T | 79.48 | 80.10 | 85.40 |

Table 4 F1 scores of different models in bimodal and trimodal feature fusion (%)
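The bimodal (T+A, V+T, A+V) and trimodal (A+V+T) settings compared in Tables 3 and 4 can be mimicked with a simple concatenation-based late fusion; this is only a generic sketch of the ablation structure, not the paper's actual fusion network.

```python
import numpy as np

def trimodal_fusion(t, a, v):
    """Fuse the three pairwise modality combinations by concatenation,
    mirroring the T+A, V+T, and A+V settings of Tables 3-4.

    t, a, v: (batch, d) text, audio, and visual utterance features.
    Returns a (batch, 6*d) trimodal representation.
    """
    ta = np.concatenate([t, a], axis=-1)   # T+A pair
    vt = np.concatenate([v, t], axis=-1)   # V+T pair
    av = np.concatenate([a, v], axis=-1)   # A+V pair
    return np.concatenate([ta, vt, av], axis=-1)

t = np.ones((4, 8))
a = np.ones((4, 8))
v = np.ones((4, 8))
fused = trimodal_fusion(t, a, v)           # shape (4, 48)
```

Dropping one pair from the final concatenation reproduces the corresponding bimodal ablation row.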
[1] | GHOSAL D, AKHTAR M S, CHAUHAN D S, et al. Contextual inter-modal attention for multi-modal sentiment analysis[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Oct 31-Nov 4, 2018. Stroudsburg: ACL, 2018: 3454-3466. |
[2] | LIN M H, MENG Z Q. Multimodal sentiment analysis based on attention neural network[J]. Computer Science, 2020, 47(S2): 508-514. |
[3] | LIU J M, ZHANG P X, LIU Y, et al. Summary of multi-modal sentiment analysis technology[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(7): 1165-1182. |
[4] | HE J, ZHANG C Q, LI X Z, et al. Survey of research on multimodal fusion technology for deep learning[J]. Computer Engineering, 2020, 46(5): 1-11. |
[5] | PORIA S, CAMBRIA E, HAZARIKA D, et al. Multi-level multiple attentions for contextual multimodal sentiment analysis[C]// Proceedings of the 2017 IEEE International Conference on Data Mining, New Orleans, Nov 18-21, 2017. Washington: IEEE Computer Society, 2017: 1033-1038. |
[6] | KUMAR A, VEPA J. Gated mechanism for attention based multi modal sentiment analysis[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, May 4-8, 2020. Piscataway:IEEE, 2020: 4477-4481. |
[7] | LIN Z, FENG M, SANTOS C N, et al. A structured self-attentive sentence embedding[J]. arXiv:1703.03130, 2017. |
[8] | ZADEH A, ZELLERS R, PINCUS E, et al. Multimodal sentiment intensity analysis in videos: facial gestures and verbal messages[J]. IEEE Intelligent Systems, 2016, 31(6): 82-88. |
[9] | CHEN M H, WANG S, LIANG P P, et al. Multimodal sentiment analysis with word-level fusion and reinforcement learning[C]// Proceedings of the 19th ACM International Conference on Multimodal Interaction, Glasgow, Nov 13-17, 2017. New York: ACM, 2017: 163-171. |
[10] | ZHANG Y Z, RONG L, SONG D W, et al. A survey on multimodal sentiment analysis[J]. Pattern Recognition and Artificial Intelligence, 2020, 33(5): 426-438. |
[11] | ZADEH A, CHEN M H, PORIA S, et al. Tensor fusion network for multimodal sentiment analysis[C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Sep 9-11, 2017. Stroudsburg: ACL, 2017: 1103-1114. |
[12] | ZADEH A, LIANG P P, PORIA S, et al. Multimodal language analysis in the wild: CMU-MOSEI dataset and interpretable dynamic fusion graph[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Jul 15-20, 2018. Stroudsburg: ACL, 2018: 2236-2246. |
[13] | PORIA S, CAMBRIA E, HAZARIKA D, et al. Context-dependent sentiment analysis in user-generated videos[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Jul 30-Aug 4, 2017. Stroudsburg: ACL, 2017: 873-883. |
[14] | MAJUMDER N, PORIA S, HAZARIKA D, et al. DialogueRNN: an attentive RNN for emotion detection in conversations[C]// Proceedings of the 2019 AAAI Conference on Artificial Intelligence, Honolulu, Jan 27-Feb 1, 2019. Palo Alto: AAAI, 2019: 6818-6825. |
[15] | SHENOY A, SARDANA A. Multilogue-Net: a context aware RNN for multi-modal emotion detection and sentiment analysis in conversation[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Seattle, Jul 5-10, 2020. Stroudsburg: ACL, 2020: 19-28. |
[16] | KIM T, LEE B. Multi-attention multimodal sentiment analysis[C]// Proceedings of the 2020 International Conference on Multimedia Retrieval, Dublin, Jun 8-11, 2020. New York: ACM, 2020: 436-441. |
[17] | ZADEH A, LIANG P P, PORIA S, et al. Multi-attention recurrent network for human communication comprehension[C]// Proceedings of the 2018 AAAI Conference on Artificial Intelligence, New Orleans, Feb 2-7, 2018. Palo Alto: AAAI, 2018: 5642-5649. |
[18] | XI C, LU G M, YAN J J. Multimodal sentiment analysis based on multi-head attention mechanism[C]// Proceedings of the 4th International Conference on Machine Learning and Soft Computing, Haiphong City, Jan 17-19, 2020. New York: ACM, 2020: 34-39. |
[19] | VERMA S, WANG J W, GE Z F, et al. Deep-HOSeq: deep higher order sequence fusion for multimodal sentiment analysis[J]. arXiv:2010.08218, 2020. |
[20] | TACHIBANA H, UENOYAMA K, AIHARA S. Efficiently trainable text-to-speech system based on deep convolutional networks with guided attention[C]// Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Apr 15-20, 2018. Piscataway: IEEE, 2018: 4784-4788. |
[21] | EYBEN F, WÖLLMER M, SCHULLER B W. openSMILE: the Munich versatile and fast open-source audio feature extractor[C]// Proceedings of the 18th ACM International Conference on Multimedia, Firenze, Oct 25-29, 2010. New York: ACM, 2010: 1459-1462. |