[1] 赵小蕾, 毛启容, 詹永照. 融合功能性副语言的语音情感识别新方法[J]. 计算机科学与探索, 2014, 8(2): 186-199.
ZHAO X L, MAO Q R, ZHAN Y Z. New method of speech emotion recognition fusing functional paralanguages[J]. Journal of Frontiers of Computer Science and Technology, 2014, 8(2): 186-199.
[2] 吴江, 黄茜, 贺超城, 等. 基于引爆点理论的人工智能生成内容微博网络舆情传播与演化分析[J]. 现代情报, 2023, 43(7): 145-161.
WU J, HUANG Q, HE C C, et al. Propagation and evolution of public opinion in the outbreak of AIGC based on the theory of tipping point[J]. Journal of Modern Information, 2023, 43(7): 145-161.
[3] 陶建华, 陈俊杰, 李永伟. 语音情感识别综述[J]. 信号处理, 2023, 39(4): 571-587.
TAO J H, CHEN J J, LI Y W. Review on speech emotion recognition[J]. Journal of Signal Processing, 2023, 39(4): 571-587.
[4] 黄鲁成, 薛爽. 机器学习技术发展现状与国际竞争分析[J]. 现代情报, 2019, 39(10): 165-176.
HUANG L C, XUE S. The development status and international competition analysis of machine learning[J]. Journal of Modern Information, 2019, 39(10): 165-176.
[5] WANKHADE M, RAO A C S, KULKARNI C. A survey on sentiment analysis methods, applications, and challenges[J]. Artificial Intelligence Review, 2022, 55(7): 5731-5780.
[6] 刘玉文, 刘月华, 杨枢, 等. 基于OTSCM模型的主题情感在线追踪[J]. 现代情报, 2017, 37(12): 35-41.
LIU Y W, LIU Y H, YANG S, et al. OTSCM approach for tracking on-line sentiment of topic[J]. Journal of Modern Information, 2017, 37(12): 35-41.
[7] 赵小明, 杨轶娇, 张石清. 面向深度学习的多模态情感识别研究进展[J]. 计算机科学与探索, 2022, 16(7): 1479-1503.
ZHAO X M, YANG Y J, ZHANG S Q. Survey of deep learning based multimodal emotion recognition[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(7): 1479-1503.
[8] 饶元, 吴连伟, 王一鸣, 等. 基于语义分析的情感计算技术研究进展[J]. 软件学报, 2018, 29(8): 2397-2426.
RAO Y, WU L W, WANG Y M, et al. Research progress on emotional computation technology based on semantic analysis[J]. Journal of Software, 2018, 29(8): 2397-2426.
[9] 赵永, 焦诗卉, 赵乾百. 基于Mel频谱和LSTM-DCNN的矿山微震信号混合识别模型[J]. 东北大学学报(自然科学版), 2023, 44(10): 1481-1489.
ZHAO Y, JIAO S H, ZHAO Q B. Hybrid recognition model of microseismic signals for mining based on Mel spectrum and LSTM-DCNN[J]. Journal of Northeastern University (Natural Science), 2023, 44(10): 1481-1489.
[10] GENE J, PARK S, SHIN H C, et al. Hybrid optical convolutional neural network with convolution kernels trained in the spatial domain[J]. Neurocomputing, 2024, 573: 127251.
[11] JIANG P X, FU H L, TAO H W, et al. Parallelized convolutional recurrent neural network with spectral features for speech emotion recognition[J]. IEEE Access, 2019, 7: 90368-90377.
[12] 李锦, 夏鸿斌, 刘渊. 基于BERT的双特征融合注意力的方面情感分析模型[J]. 计算机科学与探索, 2024, 18(1): 205-216.
LI J, XIA H B, LIU Y. Dual features local-global attention model with BERT for aspect sentiment analysis[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(1): 205-216.
[13] 彭凯贝, 孙小明, 陈皓炜, 等. 基于卷积神经网络的火车站语音情感识别方法[J]. 计算机仿真, 2023, 40(2): 177-180.
PENG K B, SUN X M, CHEN H W, et al. Railway station speech emotion recognition based on convolutional neural network[J]. Computer Simulation, 2023, 40(2): 177-180.
[14] HU Z F, LINGHU K H, YU H L, et al. Speech emotion recognition based on attention MCNN combined with gender information[J]. IEEE Access, 2023, 11: 50285-50294.
[15] 杨磊, 赵红东, 于快快. 基于多头注意力机制的端到端语音情感识别[J]. 计算机应用, 2022, 42(6): 1869-1875.
YANG L, ZHAO H D, YU K K. End-to-end speech emotion recognition based on multi-head attention[J]. Journal of Computer Applications, 2022, 42(6): 1869-1875.
[16] CHEN Z Z, LI J W, LIU H, et al. Learning multi-scale features for speech emotion recognition with connection attention mechanism[J]. Expert Systems with Applications, 2023, 214: 118943.
[17] LUNA-JIMÉNEZ C, KLEINLEIN R, GRIOL D, et al. A proposal for multimodal emotion recognition using aural transformers and action units on RAVDESS dataset[J]. Applied Sciences, 2022, 12(1): 327.
[18] ONG K L, LEE C P, LIM H S, et al. Mel-MViTv2: enhanced speech emotion recognition with Mel spectrogram and improved multiscale vision transformers[J]. IEEE Access, 2023, 11: 108571-108579.
[19] AKHTAR M S, KUMAR A, GHOSAL D, et al. A multilayer perceptron based ensemble technique for fine-grained financial sentiment analysis[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2017: 540-546.
[20] 孙颖, 李泽, 张雪英. 基于约束式双通道模型的语音情感识别[J]. 东北大学学报(自然科学版), 2023, 44(11): 1537-1542.
SUN Y, LI Z, ZHANG X Y. Speech emotion recognition based on constrained bi-channel model[J]. Journal of Northeastern University (Natural Science), 2023, 44(11): 1537-1542.
[21] ZHANG X Y, XU H Y, ZHU X Z, et al. Deep contrastive clustering via hard positive sample debiased[J]. Neurocomputing, 2024, 570: 127147.
[22] 刘振焘, 徐建平, 吴敏, 等. 语音情感特征提取及其降维方法综述[J]. 计算机学报, 2018, 41(12): 2833-2851.
LIU Z T, XU J P, WU M, et al. Review of emotional feature extraction and dimension reduction method for speech emotion recognition[J]. Chinese Journal of Computers, 2018, 41(12): 2833-2851.
[23] ZHANG T, FENG G, LIANG J, et al. Acoustic scene classification based on Mel spectrogram decomposition and model merging[J]. Applied Acoustics, 2021, 182: 108258.
[24] MENG H, YAN T H, YUAN F, et al. Speech emotion recognition from 3D Log-Mel spectrograms with deep learning network[J]. IEEE Access, 2019, 7: 125868-125881.
[25] CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1800-1807.
[26] HENDRYCKS D, GIMPEL K. Gaussian error linear units (GELUs)[EB/OL]. [2024-04-23]. https://arxiv.org/abs/1606.08415.
[27] ZHANG H, WU C R, ZHANG Z Y, et al. ResNeSt: split-attention networks[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 2735-2745.
[28] ZAMIL A A A, HASAN S, JANNATUL BAKI S M, et al. Emotion detection from speech signals using voting mechanism on classified frames[C]//Proceedings of the 2019 International Conference on Robotics, Electrical and Signal Processing Techniques. Piscataway: IEEE, 2019: 281-285.
[29] MUSTAQEEM, SAJJAD M, KWON S. Clustering-based speech emotion recognition by incorporating learned features and deep BiLSTM[J]. IEEE Access, 2020, 8: 79861-79875.
[30] ANVARJON T, MUSTAQEEM, KWON S. Deep-net: a light-weight CNN-based speech emotion recognition system using deep frequency features[J]. Sensors, 2020, 20(18): 5212.
[31] MUSTAQEEM, KWON S. Att-Net: enhanced emotion recognition system using lightweight self-attention module[J]. Applied Soft Computing, 2021, 102: 107101.
[32] GUIZZO E, WEYDE T, SCARDAPANE S, et al. Learning speech emotion representations in the quaternion domain[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, 31: 1200-1212.