Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue (10): 2551-2572. DOI: 10.3778/j.issn.1673-9418.2312034
• Frontiers·Surveys •
HU Cheng, CHEN Shihong
Online: 2024-10-01
Published: 2024-09-29
胡程,陈仕鸿
HU Cheng, CHEN Shihong. Survey of Adaptive Elastic Scaling Studies on Distributed Service Resources[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(10): 2551-2572.
胡程, 陈仕鸿. 分布式服务资源自适应弹性伸缩研究综述[J]. 计算机科学与探索, 2024, 18(10): 2551-2572.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2312034