Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (7): 1552-1560. DOI: 10.3778/j.issn.1673-9418.2101031

• Service Computing •

Structured Prediction Method for Small Sample Workload Sequences

LIU Chunhong1,2,+, ZHANG Zhihua1, JIAO Jie1, CHENG Bo3

  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007, China
  2. Engineering Lab of Intelligence Business & Internet of Things, Xinxiang, Henan 453007, China
  3. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2020-12-02 Revised: 2021-01-29 Online: 2022-07-01 Published: 2021-02-05
  • About the authors:
    LIU Chunhong, born in 1969, Ph.D., associate professor. Her research interests include cloud computing, machine learning and service computing.
    ZHANG Zhihua, born in 1994, M.S. Her research interests include cloud computing and time series prediction.
    JIAO Jie, born in 1996, M.S. candidate, student member of CCF. Her research interests include cloud computing and machine learning.
    CHENG Bo, born in 1975, Ph.D., professor. His research interests include service computing, mobile Internet and cloud computing.
  • Supported by:
    the Key Research and Development and Promotion Special (Technology Research) Project of Henan Province (202102210163); the Key Research and Development and Promotion Special (Technology Research) Project of Henan Province (202102210152); the Research and Practice Project of Higher Education Teaching Reform in Henan Province (2019SJGLX033Y); the Open Foundation of State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (SKLNST-2020-1-02)

Abstract:

Accurate workload prediction is key to elastic resource management on cloud platforms. Cloud workloads include a large number of short-running tasks, which leaves prediction models with insufficient training data. To address this problem, a structured prediction method for multivariable workload sequences (SP-MWS) is proposed. The method exploits the intrinsic correlation among the multiple resources consumed while a single task runs: it mines the information shared across multi-dimensional workload sequences to supplement the predictive information available for small-sample sequences. First, to obtain strongly correlated workload types, the maximal information coefficient (MIC) and information entropy are used to measure correlation and select workload types. Second, a trace-norm regularized multi-task learning (TNR-MTL) model is constructed. The selected related workload sequences are fed into it together, so that their structural information is mined and multiple workloads are predicted simultaneously. The method is validated on the operational monitoring log dataset of the Google cloud platform. Experimental results show that the selected related workload types clearly increase the amount of information available to the model. An interpretability analysis of the model's decision basis visualizes the contribution of each variable to the prediction results. Comparative experiments show that the proposed method outperforms commonly used workload prediction methods in both time performance and prediction accuracy.
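The selection step can be illustrated with a small sketch. The abstract does not say how MIC and information entropy are combined, so everything below is an assumption made for illustration: the series are discretized into equal-width bins, information entropy is computed on the bin labels, and a normalized mutual information score stands in for MIC (it is a simpler quantity, not MIC itself). The names `discretize`, `entropy`, `normalized_mi`, `select_related`, and the `threshold` and `bins` parameters are hypothetical.

```python
import math
from collections import Counter


def discretize(series, bins=8):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)  # a constant series falls into a single bin
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in series]


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def normalized_mi(x, y, bins=8):
    """Mutual information of two series divided by log2(bins), so it lies in [0, 1].

    Assumption: this is a simple stand-in for MIC, not MIC itself.
    """
    dx, dy = discretize(x, bins), discretize(y, bins)
    mi = entropy(dx) + entropy(dy) - entropy(list(zip(dx, dy)))
    return max(mi, 0.0) / math.log2(bins)


def select_related(target, candidates, threshold=0.3, bins=8):
    """Return {name: score} for candidate series whose score reaches the threshold."""
    scores = {name: normalized_mi(target, s, bins) for name, s in candidates.items()}
    return {name: sc for name, sc in scores.items() if sc >= threshold}


if __name__ == "__main__":
    cpu = list(range(16))            # toy CPU usage series
    mem = [2 * v + 1 for v in cpu]   # fully determined by cpu
    disk = [5] * 16                  # constant, carries no information
    print(select_related(cpu, {"mem": mem, "disk": disk}))
```

In this toy setting a series that is fully determined by the target gets the highest score, while a constant series has zero entropy and is dropped. A real pipeline would substitute an actual MIC implementation for `normalized_mi`.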

Key words: cloud computing, elasticity, workload prediction, multivariable prediction, structural information
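As a rough illustration of trace-norm regularized multi-task learning, the sketch below fits one linear predictor per workload type and couples them through a trace-norm penalty on the stacked weight matrix. The abstract does not specify the loss, the features, or the solver, so these are assumptions: a squared loss, lagged values of all related series as features, and proximal gradient descent with singular value thresholding. The names `svt`, `make_lagged`, `fit_trace_norm_mtl`, and the parameters `lam`, `lags`, and `n_iter` are hypothetical. The objective assumed here is $\min_W \tfrac{1}{2}\lVert XW - Y\rVert_F^2 + \lambda \lVert W \rVert_*$.

```python
import numpy as np


def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * trace norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def make_lagged(series, lags):
    """Build a supervised dataset from a (time, n_types) array.

    Each row of X holds the previous `lags` steps of every workload type,
    flattened; the matching row of Y holds the next step of every type.
    """
    n = series.shape[0] - lags
    X = np.stack([series[t:t + lags].ravel() for t in range(n)])
    Y = series[lags:]
    return X, Y


def fit_trace_norm_mtl(X, Y, lam=1.0, n_iter=500):
    """Proximal gradient descent for 0.5 * ||XW - Y||_F^2 + lam * ||W||_*.

    Column j of W is the linear predictor for workload type j; the trace
    norm pushes W towards low rank, so the tasks share structure.
    """
    W = np.zeros((X.shape[1], Y.shape[1]))
    # Step size 1/L, where L is the squared spectral norm of X.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y)
        W = svt(W - step * grad, step * lam)
    return W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: three correlated "workload types" driven by one latent signal.
    latent = np.sin(np.linspace(0, 12, 80))
    series = np.outer(latent, [1.0, 0.5, 2.0]) + 0.01 * rng.normal(size=(80, 3))
    X, Y = make_lagged(series, lags=4)
    W = fit_trace_norm_mtl(X, Y, lam=0.1)
    print("weight matrix shape:", W.shape)
    print("singular values:", np.round(np.linalg.svd(W, compute_uv=False), 4))
```

Larger values of `lam` shrink more singular values of `W` to zero, which forces the per-type predictors into a shared low-dimensional subspace. That is one way short sequences can borrow information from related ones.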

CLC Number: