Statistical Analysis of Local Gradient in Mel Transform Domain for Parkinson’s Dysphonia

doi:10.3778/j.issn.1673-9418.2102055

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue 10: 2345-2356.

• Artificial Intelligence •

ZHANG Tao¹,²,⁺, LIN Liqin¹,², ZHANG Yajuan¹,², NIU Xiaoxia¹,²

1. School of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
2. Hebei Key Laboratory of Information Transmission and Signal Processing, Yanshan University, Qinhuangdao, Hebei 066004, China

Received: 2021-02-24  Revised: 2021-05-20  Online: 2022-10-01  Published: 2021-05-31
About author: ZHANG Tao, born in 1979, Ph.D., associate professor, member of CCF. His research interests include medical signal processing, machine learning, concept-cognitive learning, etc.
LIN Liqin, born in 1997, M.S. candidate. Her research interests include speech signal analysis, machine learning, etc.
ZHANG Yajuan, born in 1993, M.S. candidate. Her research interests include medical signal processing, machine learning, etc.
NIU Xiaoxia, born in 1981, Ph.D., senior experimentalist. Her research interests include intelligent signal processing, machine learning, etc.
Supported by: National Natural Science Foundation of China (61971374); Natural Science Foundation of Hebei Province (F2020203010)

Chinese title: 帕金森语音障碍的Mel变换域局部梯度统计分析

Corresponding author (⁺): ZHANG Tao, E-mail: zhtao@ysu.edu.cn
Authors' places of origin: ZHANG Tao, Qinhuangdao, Hebei; LIN Liqin, Yantai, Shandong; ZHANG Yajuan, Baoding, Hebei; NIU Xiaoxia, Baoding, Hebei.

Abstract:

Dysphonia analysis underpins speech-based early diagnosis of Parkinson’s disease (PD). As research in this area has deepened, Mel transform domain information has shown growing advantages, and extracting structural features has increasingly improved classification performance. Starting from the structure of the Mel transform domain information in the speech signals of people with PD, this paper proposes a method for extracting local gradient statistical features in the Mel transform domain. First, the speech signal is converted by Mel frequency transformation into an energy signal in the time-frequency transform domain, and the energy spectrum is visualized. Next, the energy data are processed with a sliding window, and the gradient and angle of each energy point in the detection window are calculated to obtain the local structure information of the Mel transform domain. Finally, the gradients of the energy points in all detection windows are accumulated according to their angles to obtain the overall local gradient statistical features, which represent how the energy values change in the Mel transform domain. Experiments on different PD speech datasets with different classifiers show that, compared with Mel transform domain analysis, cepstrum analysis and deep learning methods, the proposed features achieve high classification accuracy and sensitivity, which verifies the effectiveness of local gradient statistical features for dysphonia analysis in PD.

Key words: Parkinson’s disease, dysphonia, Mel transform domain, local gradient statistics
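
The sliding-window and angle-based accumulation steps described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the Mel energy spectrum is already available as a 2-D list (rows are Mel bands, columns are frames), uses central differences for the gradient, and the function name `local_gradient_histogram` together with the window size, step and number of angle bins are hypothetical choices that do not come from the paper.

```python
import math


def local_gradient_histogram(energy, win=4, step=4, n_bins=8):
    """Accumulate gradient magnitudes by angle over sliding windows.

    energy : 2-D list, rows = Mel bands, columns = time frames (assumed input).
    win, step, n_bins : illustrative parameters, not values from the paper.
    Returns a list of n_bins values that sum to 1 (or all zeros if the
    energy surface is flat).
    """
    rows, cols = len(energy), len(energy[0])
    hist = [0.0] * n_bins
    bin_width = math.pi / n_bins  # unsigned angles in [0, pi)

    # Slide the detection window; start at 1 so central differences
    # never read outside the matrix.
    for r0 in range(1, rows - win, step):
        for c0 in range(1, cols - win, step):
            for r in range(r0, r0 + win):
                for c in range(c0, c0 + win):
                    gx = energy[r][c + 1] - energy[r][c - 1]  # along time
                    gy = energy[r + 1][c] - energy[r - 1][c]  # along Mel band
                    magnitude = math.hypot(gx, gy)
                    angle = math.atan2(gy, gx) % math.pi
                    index = min(int(angle / bin_width), n_bins - 1)
                    hist[index] += magnitude

    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


if __name__ == "__main__":
    # Synthetic energy surface that rises only along the time axis.
    ramp = [[float(c) for c in range(10)] for _ in range(10)]
    print(local_gradient_histogram(ramp))
```

Under these assumptions, an energy surface that changes only along the time axis puts all of its weight in the first angle bin, while one that changes only across Mel bands concentrates it in the bin around 90 degrees, which is the kind of directional energy change the feature is meant to summarize.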

CLC Number: TP391

ZHANG Tao, LIN Liqin, ZHANG Yajuan, NIU Xiaoxia. Statistical Analysis of Local Gradient in Mel Transform Domain for Parkinson’s Dysphonia[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(10): 2345-2356.

Figures/Tables 15

Fig.1 Flowchart of Mel frequency domain transformation

Fig.2 Comparison of speech in time domain, frequency domain and Mel transform domain between healthy people and patients with PD

Fig.3 Flowchart for SFLG feature extraction

Fig.4 Schematic diagram of time-frequency angle direction in transform domain

Fig.5 SFLG extraction visualization diagram

Fig.6 Schematic diagram of energy distribution in direction of point to be detected

Table 1 Comparison of SPDD and CPPDD datasets

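Dataset	SPDD	CPPDD
Recording task	Sustained phonation	Sustained phonation
Sampling frequency/kHz	44.1	44.1
Number of samples	534 (282 PD)	918 (495 PD)
Patients : healthy subjects	20:20	36:32
Medication status	Not provided	Recorded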

Fig.7 Comparison of accuracy of SVM with different kernel functions and feature dimensions

Fig.8 Comparison of accuracy of KNN with different K values and feature dimensions

Table 2 SFLG optimal parameters of SVM classifier

Kernel function	Dataset	Feature dimension	AC/%
Gaussian	SPDD	5	98.77
Gaussian	CPPDD	12	92.02

Table 3 SFLG optimal parameters of KNN classifier

$K$ value	Dataset	Feature dimension	AC/%
1	SPDD	3	96.50
3	CPPDD	11	92.62

Table 4 Accuracy for SPDD and CPPDD datasets (unit: %)

Classifier	SPDD training set	SPDD test set	CPPDD training set	CPPDD test set
SVM	98.13	98.03	92.19	92.10
KNN	97.06	96.86	92.32	92.45

Table 5 Cross validation classification accuracy

Classifier	Training on SPDD	Testing on CPPDD	Training on CPPDD	Testing on SPDD
SVM	95.70	64.41	92.35	64.79
KNN	94.15	57.08	92.03	55.81
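
The cross-dataset protocol behind Table 5 (fit on one dataset, score on the other) can be illustrated with a small k-nearest-neighbour sketch. The feature vectors below are toy values invented for illustration, the helper names `knn_predict` and `cross_dataset_accuracy` are hypothetical, and the distance metric (Euclidean) is an assumption; the listing does not state which metric the authors used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training points."""
    neighbours = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def cross_dataset_accuracy(train_x, train_y, test_x, test_y, k=3):
    """Accuracy (in %) when training on one dataset and testing on another."""
    hits = sum(
        knn_predict(train_x, train_y, q, k) == y
        for q, y in zip(test_x, test_y)
    )
    return 100.0 * hits / len(test_y)


if __name__ == "__main__":
    # Toy "dataset A": 0 = healthy, 1 = PD (made-up 2-D features).
    a_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    a_y = [0, 0, 0, 1, 1, 1]
    # Toy "dataset B" with a shifted healthy sample, mimicking a
    # difference in recording conditions between datasets.
    b_x = [(0.5, 0.5), (5.5, 5.5), (3.5, 3.5)]
    b_y = [0, 1, 0]
    print(cross_dataset_accuracy(a_x, a_y, b_x, b_y, k=3))
```

In this toy setting the shifted sample in the second dataset is misclassified, which shows how a mismatch between datasets can pull cross-dataset accuracy below within-dataset accuracy, the same direction as the gap seen in Table 5.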

Table 6 Classification accuracy of cross validation in SPDD and CPPDD datasets

Classifier	Validation method	SPDD dataset			CPPDD dataset
		AC	SE	SP	AC	SE	SP
SVM	5-fold	97.53	97.61	96.05	92.17	93.46	90.34
SVM	10-fold	97.81	97.88	96.61	92.28	94.13	90.43
SVM	Leave-one-out	97.22	97.29	97.02	92.93	92.01	91.84
KNN	5-fold	96.85	96.23	94.03	91.14	96.75	84.53
KNN	10-fold	97.85	97.98	96.69	92.04	96.75	87.72
KNN	Leave-one-out	96.63	97.70	89.09	90.69	95.51	85.89
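
The AC, SE and SP columns are not defined in this listing. Assuming they follow the usual definitions of accuracy, sensitivity and specificity, with PD as the positive class, they can be computed from confusion-matrix counts as sketched below. The function name `classification_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (AC, SE, SP) in percent from confusion-matrix counts.

    tp: PD samples classified as PD      fn: PD samples classified as healthy
    tn: healthy classified as healthy    fp: healthy classified as PD
    """
    ac = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # accuracy
    se = 100.0 * tp / (tp + fn)                   # sensitivity (recall on PD)
    sp = 100.0 * tn / (tn + fp)                   # specificity (recall on healthy)
    return ac, se, sp


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    ac, se, sp = classification_metrics(tp=90, fn=10, tn=80, fp=20)
    print(f"AC={ac:.2f}%  SE={se:.2f}%  SP={sp:.2f}%")
```

Under these definitions a classifier can combine high sensitivity with noticeably lower specificity, which is the pattern of the KNN rows for the CPPDD dataset.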

Table 7

Method	Classifier	SPDD dataset			CPPDD dataset
		AC	SE	SP	AC	SE	SP
MFCC^[21]	SVM	82.50	80.00	85.00	—	—	—
HFCC^[6]	SVM	87.50	90.00	85.00	—	—	—
IMFCC^[22]	RF	92.34	88.67	90.00	82.89	80.66	86.47
IMFCC^[22]	SVM	94.74	88.24	100.00	81.36	79.66	82.46
Convolutional neural network^[10]	—	99.82	—	—	—	—	—
VGG16 hybrid model^[11]	—	90.50	91.00	90.00	—	—	—
SFLG(ours)	SVM	97.81	97.88	97.02	92.93	94.13	91.84
SFLG(ours)	KNN	97.85	97.98	96.69	92.04	96.75	87.72

References 22

[1]	DUFFY J R. Motor speech disorders: substrates, differential diagnosis, and management[M]. Boston: Addison-Wesley, 2005.
[2]	LITTLE M A, MCSHARRY P E, ROBERTS S J, et al. Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection[J]. BioMedical Engineering OnLine, 2007, 6(1): 23.
[3]	LITTLE M A, MCSHARRY P E, HUNTER E J, et al. Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease[J]. IEEE Transactions on Biomedical Engineering, 2009, 56(4): 1015-1022.
[4]	ZHANG T, HONG W X, CHANG F X, et al. Speech features analysis of Parkinson’s disease by vowel class separability[J]. Chinese Journal of Biomedical Engineering, 2011, 30(3): 476-480.
[5]	SAKAR B E, ISENKUL M E, SAKAR C O, et al. Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings[J]. IEEE Journal of Biomedical and Health Informatics, 2013, 17(4): 828-834.
[6]	BENBA A, JILBAB A, HAMMOUCH A. Using human factor cepstral coefficient on multiple types of voice recordings for detecting patients with Parkinson’s disease[J]. Innovation and Research in BioMedical Engineering, 2017, 38(6): 346-351.
[7]	KARAN B, MAHTO K, SAHU S S. Detection of Parkinson disease using variational mode decomposition of speech signal[C]// Proceedings of the 2018 International Conference on Communication and Signal Processing, Chennai, Apr 3-5, 2018. Piscataway: IEEE, 2018: 508-512.
[8]	ZHANG X H, WANG L R, CAO Y, et al. Combining speech sample and feature bilateral selection algorithm for classification of Parkinson’s disease[J]. Journal of Biomedical Engineering, 2017, 34(6): 942-948.
[9]	LI Y M, ZHANG C, WANG P, et al. A partition bagging ensemble learning algorithm for Parkinson’s speech data mining[J]. Journal of Biomedical Engineering, 2019, 36(4): 548-556.
[10]	ZHANG T, ZHANG Y J, CAO Y Y, et al. Diagnosing Parkinson’s disease with speech signal based on convolutional neural network[J]. International Journal of Computer Applications in Technology, 2020, 63(4): 348-353.
[11]	WANG J, XU Z J. Study on augmentation and recognition of Parkinson’s voiceprint samples by HR-DCGAN method[J]. Journal of Chinese Computer Systems, 2019, 40(9): 2026-2032.
[12]	AL-FATLAWI A H, JABARDI M H, LING S H. Efficient diagnosis system for Parkinson’s disease using deep belief network[C]// Proceedings of the 2016 IEEE Congress on Evolutionary Computation, Vancouver, Jul 24-29, 2016. Piscataway: IEEE, 2016: 1324-1330.
[13]	KHAN T, WESTIN J, DOUGHERTY M. Cepstral separation difference: a novel approach for speech impairment quantification in Parkinson’s disease[J]. Biocybernetics and Biomedical Engineering, 2014, 34(1): 25-34.
[14]	OROZCO-ARROYAVE J R, HÖNIG F, ARIAS-LONDOÑO J D, et al. Automatic detection of Parkinson’s disease in running speech spoken in three different languages[J]. The Journal of the Acoustical Society of America, 2016, 139(1): 481-500.
[15]	NARANJO L, PÉREZ C J, MARTÍN J. Addressing voice recording replications for tracking Parkinson’s disease progression[J]. Medical & Biological Engineering & Computing, 2017, 55(3): 365-373.
[16]	NARANJO L, PÉREZ C J, MARTÍN J, et al. A two-stage variable selection and classification approach for Parkinson’s disease detection by using voice recording replications[J]. Computer Methods and Programs in Biomedicine, 2017, 142: 147-156.
[17]	ZHANG T, JIANG P P, ZHANG Y J, et al. Parkinson’s disease diagnosis based on local statistics of speech signal in transformation domain[J]. Journal of Biomedical Engineering, 2021, 38(1): 21-29.
[18]	ZHANG T, ZHANG Y J, SUN H, et al. Parkinson disease detection using energy direction features based on EMD from voice signal[J]. Biocybernetics and Biomedical Engineering, 2020, 41(1): 127-141.
[19]	ZHANG T, JIANG P P, LI L, et al. Dysphonic analysis of speech disorders in Parkinson’s disease based on partially ordered topological graph[J]. Chinese Journal of Biomedical Engineering, 2019, 38(1): 62-72.
[20]	LITTLE M A, VAROQUAUX G, SAEB S, et al. Using and understanding cross-validation strategies. Perspectives on Saeb et al[J]. GigaScience, 2017, 6(5): 1-6.
[21]	BENBA A, JILBAB A, HAMMOUCH A. Analysis of multiple types of voice recordings in cepstral domain using MFCC for discriminating between patients with Parkinson’s disease and healthy people[J]. International Journal of Speech Technology, 2016, 19(3): 449-456.
[22]	KARAN B, SAHU S S, MAHTO K. Parkinson disease prediction using intrinsic mode function based features from speech signal[J]. Biocybernetics and Biomedical Engineering, 2020, 40(1): 249-264.
