Survey on Sequence Data Augmentation

doi:10.3778/j.issn.1673-9418.2012062

Abstract

In pursuit of higher accuracy, deep learning models have become increasingly complex, with ever deeper networks. A larger number of parameters means that more data are needed to train a model. However, manually labeling data is costly, and in some fields data are hard to collect for practical reasons, so insufficient data is a very common problem. Data augmentation alleviates this problem by artificially generating new data. Its success in computer vision has prompted interest in applying similar methods to sequence data. This paper describes time-domain methods such as flipping and cropping as well as augmentation methods in the frequency domain. In addition to methods designed from experience or domain knowledge, it gives detailed descriptions of machine learning models, such as GANs, that generate data automatically. It covers methods that have been widely applied to various kinds of sequence data, including text, audio and time series, and touches on their performance on problems such as medical diagnosis and emotion classification. Although the data types differ, these methods share similar design ideas. Following these ideas as a thread, the paper introduces data augmentation methods for different types of sequence data and offers some discussion and outlook.

Key words: sequence data, data augmentation, deep learning
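
As a rough illustration of the time-domain methods named in the abstract, the sketch below applies flipping and cropping to a toy series. It is a minimal sketch under stated assumptions, not the survey's own implementation: the function names (`reverse_time`, `flip_sign`, `random_crop`) and the sample values are hypothetical. "Flipping" is read here in two ways, since the term is used both for reversing a series along the time axis and for inverting the sign of its values.

```python
import random


def reverse_time(series):
    """Flip along the time axis: return the samples in reverse order."""
    return series[::-1]


def flip_sign(series):
    """Flip across the time axis: negate every sample."""
    return [-x for x in series]


def random_crop(series, crop_len, rng):
    """Return a contiguous window of crop_len samples at a random offset."""
    if not 0 < crop_len <= len(series):
        raise ValueError("crop_len must be between 1 and len(series)")
    # randint is inclusive on both ends, so the last valid start is allowed.
    start = rng.randint(0, len(series) - crop_len)
    return series[start:start + crop_len]


# Hypothetical toy series; a fixed seed keeps the crop reproducible.
signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
rng = random.Random(0)

augmented = [
    reverse_time(signal),
    flip_sign(signal),
    random_crop(signal, crop_len=4, rng=rng),
]
```

Each augmented copy would keep the label of the original series. Whether that is valid depends on the task: reversing or negating a signal can change its meaning in some domains, such as ECG.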

GE Yizhou, XU Xiang, YANG Suorong, ZHOU Qing, SHEN Furao. Survey on Sequence Data Augmentation[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(7): 1207-1219.
