Acoustic Features Extraction of Speech Enhancement Based on Auto-Encoder Feature

doi:10.3778/j.issn.1673-9418.1807003

Abstract

In supervised speech enhancement, feature extraction is a key step. Auditory features such as the existing combined features and the multi-resolution cochleagram (MRCG) are commonly used. Although enhanced speech based on these features is considerably more intelligible, a large amount of noise remains and the speech quality, measured by signal-to-noise ratio (SNR), is low. To improve the quality of enhanced speech without affecting intelligibility, an integrated feature (IF) based on the auto-encoder feature (AEF) is proposed. First, the AEF is extracted with an auto-encoder. Then, the Group Lasso algorithm is used to verify the complementarity and redundancy between the AEF and the auditory features, and the features are recombined to form the IF. Finally, the IF is used as the input feature of the speech enhancement system. Experiments are carried out on the TIMIT corpus and the Noisex-92 noise library. Compared with traditional speech enhancement methods, and with deep learning methods that use the existing combined features or MRCG as input features, the proposed algorithm greatly improves the quality of the enhanced speech.

Keywords: auto-encoder feature, deep neural network, feature extraction, signal-to-noise ratio (SNR)
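
The abstract does not give the paper's Group Lasso formulation, so the following is only a minimal sketch of the general idea, under stated assumptions. It shows the standard block soft-thresholding step (the proximal operator of the group penalty), which shrinks the weights of each feature group together and sets a whole group to zero when its norm falls below the threshold. A group that is zeroed out contributes little beyond the others, which is one way redundancy between feature sets can show up. The function names, the group name `redundant_feature`, the weight values, and the threshold are all hypothetical and chosen for illustration.

```python
import math


def group_soft_threshold(weights, threshold):
    """Block soft-thresholding: proximal operator of threshold * ||w||_2.

    The whole group is set to zero if its Euclidean norm is at most
    `threshold`; otherwise every weight is shrunk by the same factor.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= threshold:
        return [0.0] * len(weights)
    scale = 1.0 - threshold / norm
    return [scale * w for w in weights]


def prune_feature_groups(group_weights, threshold):
    """Shrink every named feature group and list the groups that survive."""
    shrunk = {
        name: group_soft_threshold(weights, threshold)
        for name, weights in group_weights.items()
    }
    kept = [name for name, w in shrunk.items() if any(v != 0.0 for v in w)]
    return shrunk, kept


# Hypothetical weights for three feature groups (illustrative values only,
# not taken from the paper).
example_weights = {
    "AEF": [0.9, -0.7, 0.5],
    "MRCG": [0.6, 0.4, -0.3],
    "redundant_feature": [0.05, -0.02, 0.01],
}

shrunk, kept = prune_feature_groups(example_weights, threshold=0.1)
print("groups kept:", kept)
```

In a full Group Lasso solver this step would alternate with a gradient step on the regression loss; here it is applied once to fixed weights to keep the example short.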


ZHANG Tao, REN Xiangying, LIU Yang, GENG Yanzhang. Acoustic Features Extraction of Speech Enhancement Based on Auto-Encoder Feature[J]. Journal of Frontiers of Computer Science and Technology, 2019, 13(8): 1341-1350.

张涛，任相赢，刘阳，耿彦章. 基于自编码特征的语音增强声学特征提取[J]. 计算机科学与探索, 2019, 13(8): 1341-1350.

