深度学习应用于时序预测研究综述

doi:10.3778/j.issn.1673-9418.2211108

摘要/Abstract

摘要： 时间序列一般是指对某种事物发展变化过程进行观测并按照一定频率采集得出的一组随机变量。时间序列预测的任务就是从众多数据中挖掘出其蕴含的核心规律并且依据已知的因素对未来的数据做出准确的估计。由于大量物联网数据采集设备的接入、多维数据的爆炸增长和对预测精度的要求愈发苛刻，经典的参数模型以及传统机器学习算法难以满足预测任务的高效率和高精度需求。近年来，以卷积神经网络、循环神经网络和Transformer模型为代表的深度学习算法在时间序列预测任务中取得了丰硕的成果。为进一步促进时间序列预测技术的发展，综述了时间序列数据的常见特性、数据集和模型的评价指标，并以时间和算法架构为研究主线，实验对比分析了各预测算法的特点、优势和局限；着重介绍对比了多个基于Transformer模型的时间序列预测方法；最后结合深度学习应用于时间序列预测任务存在的问题与挑战，对未来该方向的研究趋势进行了展望。

关键词: 时间序列数据, 时间序列预测, 深度学习, Transformer模型

Abstract: The time series is generally a set of random variables that are observed and collected at a certain frequency in the course of something??s development. The task of time series forecasting is to extract the core patterns from a large amount of data and to make accurate estimates of future data based on known factors. Due to the access of a large number of IoT data collection devices, the explosive growth of multidimensional data and the increasingly demanding requirements for prediction accuracy, it is difficult for classical parametric models and traditional machine learning algorithms to meet high efficiency and high accuracy requirements of prediction tasks. In recent years, deep learning algorithms represented by convolutional neural networks, recurrent neural networks and Trans-former models have achieved fruitful results in time series forecasting tasks. To further promote the development of time series prediction technology, common characteristics of time series data, evaluation indexes of datasets and models are reviewed, and the characteristics, advantages and limitations of each prediction algorithm are experimentally compared and analyzed with time and algorithm architecture as the main research line. Several time series prediction methods based on Transformer model are highlighted and compared. Finally, according to the problems and challenges of deep learning applied to time series prediction tasks, this paper provides an outlook on the future research trends in this direction.

Key words: time series data, time series prediction, deep learning, Transformer model

梁宏涛, 刘硕, 杜军威, 胡强, 于旭. 深度学习应用于时序预测研究综述[J]. 计算机科学与探索, 2023, 17(6): 1285-1300.

LIANG Hongtao, LIU Shuo, DU Junwei, HU Qiang, YU Xu. Review of Deep Learning Applied to Time Series Prediction[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(6): 1285-1300.

参考文献

[1] YUAN Y, LIN L. Self-supervised pretraining of transformers for satellite image time series classification[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 474-487.
[2] ZERVEAS G, JAYARAMAN S, PATEL D, et al. A transformer-based framework for multivariate time series represen-tation learning[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Singapore, Aug 14-18, 2021. New York: ACM, 2021: 2114-2124.
[3] 张国豪, 刘波. 采用CNN和Bidirectional GRU的时间序列分类研究[J]. 计算机科学与探索, 2019, 13(6): 916-927.
ZHANG G H, LIU B. Research on time series classifica-tion using CNN and bidirectional GRU[J]. Journal of Fron-tiers of Computer Science and Technology, 2019, 13(6): 916-927.
[4] 张雅雯, 王志海, 刘海洋, 等. 基于多尺度残差FCN的时间序列分类算法[J]. 软件学报, 2022, 33(2): 555-570.
ZHANG Y W, WANG Z H, LIU H Y, et al. Time series classification algorithm based on multiscale residual full convolutional neural network[J]. Journal of Software, 2022, 33(2): 555-570.
[5] RUFF L, KAUFFMANN J R, VANDERMEULEN R A, et al. A unifying review of deep and shallow anomaly detec-tion[J]. Proceedings of the IEEE, 2021, 109(5): 756-795.
[6] MENG H, ZHANG Y, LI Y, et al. Spacecraft anomaly detec-tion via transformer reconstruction error[C]//Proceedings of the 2019 International Conference on Aerospace System Science and Engineering, Toronto, Jul 30-Aug 1, 2019. Cham: Springer, 2020: 351-362.
[7] CHEN Z, CHEN D, ZHANG X, et al. Learning graph struc-tures with transformer for multivariate time series anomaly detection in IoT[J]. IEEE Internet of Things Journal, 2022, 9(12): 9179-9189.
[8] SHCHUR O, TüRKMEN A C, JANUSCHOWSKI T, et al. Neural temporal point processes: a review[J]. arXiv:2104.03528, 2021.
[9] ZHANG Q, LIPANI A, KIRNAP O, et al. Self-attentive Hawkes process[C]//Proceedings of the 2020 International Conference on Machine Learning, Jul 13-18, 2020: 11183-11193.
[10] ZUO S, JIANG H, LI Z, et al. Transformer Hawkes process[C]//Proceedings of the 2020 International Conference on Machine Learning, Jul 13-18, 2020: 11692-11702.
[11] ESLING P, AGON C. Time-series data mining[J]. ACM Computing Surveys, 2012, 45(1): 1-34.
[12] LIM B, ZOHREN S. Time-series forecasting with deep lear-ning: a survey[J]. Philosophical Transactions of the Royal Society A, 2021, 379(2194): 20200209.
[13] TORRES J F, HADJOUT D, SEBAA A, et al. Deep learning for time series forecasting: a survey[J]. Big Data, 2021, 9(1): 3-21.
[14] 洪申达, 尹宁, 邱镇, 等. SPG-Suite:面向伪周期时间序列的预测方法[J]. 计算机科学与探索, 2014, 8(10): 1153-1161.
HONG S D, YIN N, QIU Z, et al. SPG-Suite: forecasting method towards pseudo periodic time series[J]. Journal of Frontiers of Computer Science and Technology, 2014, 8(10): 1153-1161.
[15] GAO J, SULTAN H, HU J, et al. Denoising nonlinear time series by adaptive filtering and wavelet shrinkage: a compa-rison[J]. IEEE Signal Processing Letters, 2009, 17(3): 237-240.
[16] ROJO-áLVAREZ J L, MARTíNEZ-RAMóN M, DE PRADO-CUMPLIDO M, et al. Support vector method for robust ARMA system identification[J]. IEEE Transactions on Signal Processing, 2004, 52(1): 155-164.
[17] 赵洪科, 吴李康, 李徵, 等. 基于深度神经网络结构的互联网金融市场动态预测[J]. 计算机研究与发展, 2019, 56(8): 1621-1631.
ZHAO H K, WU L K, LI Z, et al. Predicting the dynamics in Internet finance based on deep neural network structure[J]. Journal of Computer Research and Development, 2019, 56(8): 1621-1631.
[18] HONG T, FAN S. Probabilistic electric load forecasting: a tutorial review[J]. International Journal of Forecasting, 2016, 32(3): 914-938.
[19] SHAO H, SOONG B H. Traffic flow prediction with long short-term memory networks (LSTMs)[C]//Proceedings of the 2016 IEEE Region 10 Conference, Singapore, Nov 22-25, 2016. Piscataway: IEEE, 2017: 2986-2989.
[20] 王永恒, 高慧, 陈炫伶. 采用变结构动态贝叶斯网络的交通流量预测[J]. 计算机科学与探索, 2017, 11(4): 528-538.
WANG Y H, GAO H, CHEN X L. Traffic prediction method using structure varying dynamic Bayesian networks[J]. Journal of Frontiers of Computer Science and Technology, 2017, 11(4): 528-538.
[21] 郑月彬, 朱国魂. 基于Twitter数据的时间序列模型在流行性感冒预测中的应用[J]. 中国预防医学杂志, 2019, 20(9): 793-798.
ZHENG Y B, ZHU G H. Application of Twitter time series model in influenza prediction[J]. Chinese Preventive Medicine, 2019, 20(9): 793-798.
[22] 宋亚奇, 周国亮, 朱永利. 智能电网大数据处理技术现状与挑战[J]. 电网技术, 2013, 37(4): 927-935.
SONG Y Q, ZHOU G L, ZHU Y L. Present status and challenges of big data processing in smart grid[J]. Power System Technology, 2013, 37(4): 927-935.
[23] WEN Q, HE K, SUN L, et al. RobustPeriod: robust time-frequency mining for multiple periodicity detection[C]//Pro-ceedings of the 2021 International Conference on Manage-ment of Data, Jun 20-25, 2021. New York: ACM, 2021: 2328-2337.
[24] 李盼盼, 宋韶旭, 王建民. 时间序列对称模式挖掘[J]. 软件学报, 2022, 33(3): 968-984.
LI P P, SONG S X, WANG J M. Time series symmetric pattern mining[J]. Journal of Software, 2022, 33(3): 968-984.
[25] WILLMOTT C J, MATSUURA K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance[J]. Climate Research, 2005, 30(1): 79-82.
[26] WANG Z, BOVIK A C. Mean squared error: love it or leave it? A new look at signal fidelity measures[J]. IEEE signal Processing Magazine, 2009, 26(1): 98-117.
[27] CHAI T, DRAXLER R R. Root mean square error (RMSE) or mean absolute error (MAE)?—arguments against avoiding RMSE in the literature[J]. Geoscientific Model Development, 2014, 7(3): 1247-1250.
[28] DE MYTTENAERE A, GOLDEN B, LE GRAND B, et al. Mean absolute percentage error for regression models[J]. Neurocomputing, 2016, 192: 38-48.
[29] CAMERON A C, WINDMEIJER F A G. An R-squared mea-sure of goodness of fit for some common nonlinear regres-sion models[J]. Journal of Econometrics, 1997, 77(2): 329-342.
[30] 万晨, 李文中, 丁望祥, 等. 一种基于自演化预训练的多变量时间序列预测算法[J]. 计算机学报, 2022, 45(3): 513-525.
WAN C, LI W Z, DING W X, et al. A multivariate time series forecasting algorithm based on self-evolutionary pre-training[J]. Chinese Journal of Computers, 2022, 45(3): 513-525.
[31] GOODFELLOW I, BENGIO Y, COURVILLE A, et al. Deep learning[M]. Cambridge: MIT Press, 2016: 326-366.
[32] GU J, WANG Z, KUEN J, et al. Recent advances in convo-lutional neural networks[J]. Pattern recognition, 2018, 77: 354-377.
[33] LI L Z, OTA K, DONG M X. Everything is image: CNN-based short-term electrical load forecasting for smart grid[C]//Proceedings of the 2017 14th International Symposium on Pervasive Systems, Algorithms and Networks & 2017 11th International Conference on Frontier of Computer Science and Technology & 2017 3rd International Symposium of Creative Computing, Exeter, Jun 21-23, 2017. Washington:IEEE Computer Society, 2017: 344-351.
[34] BOROVYKH A, BOHTE S, OOSTERLEE C W. Conditional time series forecasting with convolutional neural networks[J]. arXiv:1703.04691, 2017.
[35] DONG X S, QIAN L J, HUANG L. Short-term load fore-casting in smart grid: a combined CNN and K-means clus-tering approach[C]//Proceedings of the 2017 IEEE Interna-tional Conference on Big Data and Smart Computing, Jeju, Feb 13-16, 2017. Piscataway: IEEE, 2017: 119-125.
[36] BAI S J, KOLTER J Z, KOLTUN V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling[J]. arXiv:1803.01271, 2018.
[37] GOODFELLOW I, BENGIO Y, COURVILLE A. Deep learning[M]. Cambridge: MIT Press, 2016: 363-405.
[38] SCHUSTER M, PALIWAL K K. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing, 1997, 45(11): 2673-2681.
[39] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[40] YU Y, SI X, HU C, et al. A review of recurrent neural networks: LSTM cells and network architectures[J]. Neural Computation, 2019, 31(7): 1235-1270.
[41] 王鑫, 吴际, 刘超, 等. 基于LSTM循环神经网络的故障时间序列预测[J]. 北京航空航天大学学报, 2018, 44(4): 772-784.
WANG X, WU J, LIU C, et al. Fault time series prediction based on LSTM recurrent neural network[J]. Journal of Beijing University of Aeronautics and Astronautics, 2018, 44(4): 772-784.
[42] GRAVES A, SCHMIDHUBER J. Framewise phoneme classi-fication with bidirectional LSTM and other neural network architectures[J]. Neural Networks, 2005, 18(5/6): 602-610.
[43] SIAMI-NAMINI S, TAVAKOLI N, NAMIN A S. The per-formance of LSTM and BiLSTM in forecasting time series[C]//Proceedings of the 2019 IEEE International Conference on Big Data, Los Angeles, Dec 9-12, 2019. Piscataway: IEEE, 2019: 3285-3292.
[44] CHO K, VAN MERRI?NBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. arXiv:1406.1078, 2014.
[45] FU R, ZHANG Z, LI L. Using LSTM and GRU neural network methods for traffic flow prediction[C]//Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation, Wuhan, Nov 11-13, 2016. Pisca-taway: IEEE, 2016: 324-328.
[46] QI Y, LI C, DENG H, et al. A deep neural framework for sales forecasting in e-commerce[C]//Proceedings of the 28th ACM International Conference on Information and Knowle-dge Management, Beijing, Nov 3-7, 2019. New York: ACM, 2019: 299-308.
[47] XIN S, ESTER M, BU J, et al. Multi-task based sales pre-dictions for online promotions[C]//Proceedings of the 28th ACM International Conference on Information and Knowle-dge Management, Beijing, Nov 3-7, 2019. New York: ACM, 2019: 2823-2831.
[48] 王文冠, 沈建冰, 贾云得. 视觉注意力检测综述[J]. 软件学报, 2019, 30(2): 416-439.
WANG W G, SHEN J B, JIA Y D. Review of visual atten-tion detection[J]. Journal of Software, 2019, 30(2): 416-439.
[49] CHOROWSKI J K, BAHDANAU D, SERDYUK D, et al. Attention-based models for speech recognition[C]//Advances in Neural Information Processing Systems 28, Montreal,Dec 7-12, 2015: 577-585.
[50] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Proces-sing Systems 30, Long Beach, Dec 4-9, 2017: 5998-6008.
[51] LI S, JIN X, XUAN Y, et al. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting[C]//Advances in Neural Information Processing Systems 32, Vancouver, Dec 8-14, 2019: 5243-5253.
[52] WEN Q, ZHOU T, ZHANG C, et al. Transformers in time series: a survey[J]. arXiv:2202.07125, 2022.
[53] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understan-ding[J]. arXiv:1810.04805, 2018.
[54] JIN K H, WI J A, LEE E J, et al. TrafficBERT: pre-trained model with large-scale data for long-range traffic flow fore-casting[J]. Expert Systems with Applications, 2021, 186: 115738.
[55] WU S, XIAO X, DING Q, et al. Adversarial sparse transfor-mer for time series forecasting[C]//Advances in Neural In-formation Processing Systems 33, Dec 6-12, 2020: 17105-17115.
[56] CHILD R, GRAY S, RADFORD A, et al. Generating long sequences with sparse transformers[J]. arXiv:1904.10509, 2019.
[57] ZHOU H Y, ZHANG S H, PENG J Q, et al. Informer: beyond efficient transformer for long sequence time-series forecasting[C]//Proceedings of the 35th AAAI Conference on Artificial Intelligence, the 33rd Conference on Innovative Applications of Artificial Intelligence, the 11th Symposium on Educational Advances in Artificial Intelligence, Feb 2-9, 2021. Menlo Park: AAAI, 2021: 11106-11115.
[58] LIM B, ARIK S ?, LOEFF N, et al. Temporal fusion trans-formers for interpretable multi-horizon time series forecas-ting[J]. International Journal of Forecasting, 2021, 37(4): 1748-1764.
[59] LIN Y, KOPRINSKA I, RANA M. SSDNet: state space decomposition neural network for time series forecasting[C]//Proceedings of the 2021 IEEE International Conference on Data Mining, Auckland, Dec 7-10, 2021. Piscataway: IEEE, 2021: 370-378.
[60] SALINAS D, FLUNKERT V, GASTHAUS J, et al. DeepAR: probabilistic forecasting with autoregressive recurrent net-works[J]. International Journal of Forecasting, 2020, 36(3): 1181-1191.
[61] RANGAPURAM S S, SEEGER M W, GASTHAUS J, et al. Deep state space models for time series forecasting[C]// Advances in neural Information Processing Systems 31, Montréal, Dec 3-8, 2018: 7785-7794.
[62] ORESHKIN B N, CARPOV D, CHAPADOS N, et al. N-BEATS: neural basis expansion analysis for interpretable time series forecasting[J]. arXiv:1905.10437, 2019.
[63] VAGROPOULOS S I, CHOULIARAS G I, KARDAKOS E G, et al. Comparison of SARIMAX, SARIMA, modified SARIMA and ANN-based models for short-term PV gene-ration forecasting[C]//Proceedings of the 2016 IEEE Inter-national Energy Conference, Leuven, Apr 4-8, 2016. Pisca-taway: IEEE, 2016: 1-6.
[64] GIBRAN K, BUSHRUI S B. The prophet: a new annotated edition[M]. Simon and Schuster, 2012.
[65] WU H, XU J, WANG J, et al. Autoformer: decomposition transformers with auto-correlation for long-term series fore-casting[C]//Advances in Neural Information Processing Sys-tems 34, Dec 6-14, 2021: 22419-22430.
[66] QI X, HOU K, LIU T, et al. From known to unknown: know-ledge-guided transformer for time-series sales forecasting in Alibaba[J]. arXiv:2109.08381, 2021.
[67] ZHOU T, MA Z, WEN Q, et al. FEDformer: frequency en-hanced decomposed transformer for long-term series fore-casting[J]. arXiv:2201.12740, 2022.
[68] BRACEWELL R N, BRACEWELL R N. The Fourier transform and its applications[M]. New York: McGraw-Hill, 1986.
[69] ZHANG D S. Wavelet transform[M]//Fundamentals of Image Data Mining. Berlin, Heidelberg: Springer, 2019: 35-44.
[70] WEN Q S, GAO J K, SONG X M, et al. RobustSTL: a robust seasonal-trend decomposition algorithm for long time series[C]//Proceedings of the 33rd AAAI Conference on Artificial Intelligence, the 31st Innovative Applications of Artificial Intelligence Conference, the 9th AAAI Symposium on Edu-cational Advances in Artificial Intelligence, Honolulu, Jan 27-Feb 1, 2019. Menlo Park: AAAI, 2019: 5409-5416.
[71] WANG Z. Fast algorithms for the discrete W transform and for the discrete Fourier transform[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1984, 32(4): 803-816.
[72] SHENSA M J. The discrete wavelet transform: wedding the a trous and Mallat algorithms[J]. IEEE Transactions on Signal Processing, 1992, 40(10): 2464-2482.
[73] LIU S, YU H, LIAO C, et al. Pyraformer: low-complexity pyramidal attention for long-range time series modeling and forecasting[C]//Proceedings of the 2022 International Con-ference on Learning Representations, Apr 25-29, 2022: 1-20.
[74] LI Y, LU X, XIONG H, et al. Towards long-term time-series forecasting: feature, pattern, and distribution[J]. arXiv:2301.02068, 2023.