Voice Activity Detection Based on Long-Term Power Spectrum Variability

doi:10.3778/j.issn.1673-9418.1809029

Journal of Frontiers of Computer Science and Technology, 2019, Vol. 13, Issue (9): 1534-1542. DOI: 10.3778/j.issn.1673-9418.1809029

ZHANG Tao, LIU Yang, REN Xiangying

School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China

Online: 2019-09-01 Published: 2019-09-06

Abstract

Voice activity detection (VAD) is a foundation of speech signal processing. To improve VAD accuracy at low signal-to-noise ratio (SNR) and in nonstationary noise, this paper proposes a speech feature based on long-term power spectrum variability (LPSV) and uses a threshold decision method to evaluate the feature's potential for VAD. The method first measures how much the power spectrum of the input signal varies over a long-term window. It then makes a threshold decision; after initialization, the threshold is updated adaptively according to each decision result. Finally, a voting mechanism determines whether the current target frame is speech. Simulation results show that, compared with two classical VAD algorithms based on long-term features (long-term signal variability, LTSV, and long-term spectral flatness measure, LSFM), the proposed method achieves higher detection accuracy across different noise types and SNR conditions. The improvement is most pronounced in nonstationary noise; for example, in machine-gun noise the average detection accuracy increases by more than 10%.

Key words: voice activity detection, long-term power spectrum variability (LPSV), low signal-to-noise ratio, nonstationary noise
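
The three stages named in the abstract (long-term feature, adaptive threshold, voting) can be illustrated with a minimal sketch. The abstract does not give the exact LPSV formula, the threshold update rule, or the voting scheme, so everything below is an assumption made for illustration: the feature is taken to be the variance of the log power spectrum across the last `window` frames, averaged over frequency bins; the threshold is a multiple of a running noise-level estimate; and the vote is a simple majority over neighboring frames. The function and parameter names (`detect_speech`, `margin`, `alpha`, `floor`, `votes`) are hypothetical and not taken from the paper.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Power spectrum of one non-empty frame via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def long_term_variability(spectra: List[List[float]], m: int, window: int) -> float:
    """Assumed stand-in for LPSV: variance of log power across the last
    `window` frames ending at frame m, averaged over frequency bins."""
    block = spectra[max(0, m - window + 1):m + 1]
    n_bins = len(block[0])
    total = 0.0
    for k in range(n_bins):
        logs = [math.log(s[k] + 1e-12) for s in block]  # small offset avoids log(0)
        mean = sum(logs) / len(logs)
        total += sum((v - mean) ** 2 for v in logs) / len(logs)
    return total / n_bins


def detect_speech(frames: List[Sequence[float]], window: int = 10,
                  init_frames: int = 10, margin: float = 2.0,
                  alpha: float = 0.9, floor: float = 1e-3,
                  votes: int = 5) -> List[bool]:
    """Return one speech/non-speech decision per frame."""
    if not frames:
        return []
    spectra = [power_spectrum(f) for f in frames]
    feats = [long_term_variability(spectra, m, window)
             for m in range(len(frames))]

    # Initialization: assume the leading frames contain noise only.
    init = feats[:init_frames]
    noise_level = sum(init) / len(init)

    raw = []
    for f in feats:
        threshold = margin * noise_level + floor
        is_speech = f > threshold
        if not is_speech:
            # Adaptive update: track the feature level of non-speech frames.
            noise_level = alpha * noise_level + (1 - alpha) * f
        raw.append(is_speech)

    # Voting: majority of the raw decisions in a window centered on each frame.
    half = votes // 2
    decisions = []
    for m in range(len(raw)):
        seg = raw[max(0, m - half):m + half + 1]
        decisions.append(2 * sum(seg) > len(seg))
    return decisions
```

Because the feature looks back over `window` frames, decisions in this sketch lag the true end of a speech segment by up to `window - 1` frames. The naive DFT is used only to keep the example dependency-free; an FFT routine would normally replace it.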

ZHANG Tao, LIU Yang, REN Xiangying. Voice Activity Detection Based on Long-Term Power Spectrum Variability[J]. Journal of Frontiers of Computer Science and Technology, 2019, 13(9): 1534-1542.

