Journal of Frontiers of Computer Science and Technology, 2012, Vol. 6, Issue (2): 97-108. DOI: 10.3778/j.issn.1673-9418.2012.02.001

• Academic Research •

Research on Dynamic Scaling of Elastic Distributed Cache Systems

ZHU Xin, QIN Xiulei, WANG Lianhua, ZHANG Wenbo, ZHONG Hua

  1. Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  2. Graduate University, Chinese Academy of Sciences, Beijing 100190, China
  3. Beijing Shenzhou Aerospace Software Technology Co., Ltd., Beijing 100094, China
  • Online: 2012-02-01 Published: 2012-02-01

Abstract: This paper studies the key problems in implementing dynamic scaling for elastic distributed cache systems. For the data rebalancing problem that arises during scaling, it proposes a hotspot sensitive data rebalancing algorithm (HSDRA) suited to heterogeneous environments. HSDRA balances both memory footprint and network traffic, identifies hotspot partitions online, and gives priority to distributing them evenly across the cache nodes. To preserve data consistency and continuous availability of the cache service during scaling, it proposes a data access protocol based on two-phase requests and a controlled data migration algorithm. Experimental results show that the approach scales the cache system dynamically while maintaining data consistency and continuous availability, and that HSDRA achieves shorter response times than a weighted static data rebalancing algorithm that does not consider the actual load on each partition.

Key words: distributed cache, dynamic scaling, hotspot data, data migration
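
The abstract names the goals of hotspot-aware rebalancing (spread hotspot partitions first, balance memory and network load, respect heterogeneous node capacities) but does not give the algorithm itself. The following is only an illustrative sketch of that general idea, not the paper's HSDRA. Every name here (`Partition`, `Node`, `rebalance`, `hot_fraction`) is hypothetical, as are the rule that the top fraction of partitions by traffic counts as "hot" and the greedy placement strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    pid: int
    memory: float   # memory footprint, any consistent unit
    traffic: float  # observed network traffic, any consistent unit


@dataclass
class Node:
    name: str
    mem_capacity: float  # heterogeneous nodes differ in capacity
    net_capacity: float
    partitions: list = field(default_factory=list)

    def load_after(self, p: Partition) -> float:
        """Worst of the memory and network utilisation if p were placed here."""
        mem = (sum(q.memory for q in self.partitions) + p.memory) / self.mem_capacity
        net = (sum(q.traffic for q in self.partitions) + p.traffic) / self.net_capacity
        return max(mem, net)


def rebalance(partitions, nodes, hot_fraction=0.2):
    """Greedy placement: hotspot partitions first, then the rest.

    Assumption: the top `hot_fraction` of partitions by traffic are "hot".
    """
    for n in nodes:
        n.partitions.clear()
    ranked = sorted(partitions, key=lambda p: p.traffic, reverse=True)
    hot_count = max(1, int(len(ranked) * hot_fraction))
    hot, cold = ranked[:hot_count], ranked[hot_count:]
    hot_ids = {p.pid for p in hot}

    # Phase 1: spread hot partitions so that the number of hot partitions
    # per unit of network capacity stays as even as possible.
    for p in hot:
        target = min(
            nodes,
            key=lambda n: (
                (sum(1 for q in n.partitions if q.pid in hot_ids) + 1) / n.net_capacity,
                n.load_after(p),
            ),
        )
        target.partitions.append(p)

    # Phase 2: place the remaining partitions on the node whose worst
    # utilisation (memory or network) would be lowest afterwards.
    for p in cold:
        target = min(nodes, key=lambda n: n.load_after(p))
        target.partitions.append(p)

    return {n.name: [p.pid for p in n.partitions] for n in nodes}
```

In this sketch a node with twice the network capacity receives roughly twice as many hot partitions, which is one simple way to read "uniform distribution" in a heterogeneous cluster. A real system would also need to limit how much data moves between the old and new layouts, which the sketch ignores.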