计算机科学与探索 (Journal of Frontiers of Computer Science and Technology) ›› 2014, Vol. 8 ›› Issue (3): 257-265. DOI: 10.3778/j.issn.1673-9418.1311012

• 学术研究 •

基于云存储的块级连续数据保护系统

顾  瑜1,刘川意2,鞠大鹏3,汪东升3+   

  1. 清华大学 计算机科学与技术系,北京 100084
  2. 北京邮电大学 可信分布式计算与服务教育部重点实验室,北京 100876
  3. 清华大学 信息科学与技术国家实验室,北京 100084
  • 出版日期:2014-03-01 发布日期:2014-03-05

Cloud Based Block-Level Continuous Data Protection System

GU Yu1, LIU Chuanyi2, JU Dapeng3, WANG Dongsheng3+   

  1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  2. Key Laboratory of Trustworthy Distributed Computing and Service of Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China
  3. National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China
  • Online: 2014-03-01 Published: 2014-03-05

摘要: 提出了基于云存储的块级连续数据保护系统MYCDP。MYCDP利用灵活且性价比高的云资源存储业务数据,并采用全局数据去重技术压缩数据量,以取得更低的备份成本。使用针对性设计的版本索引结构和本地缓存机制优化低带宽、高延迟云环境下的数据恢复速度。给出了系统详细结构设计与具体流程描述,通过原型系统与其他方案系统的对比测试,验证了MYCDP能够取得比同样基于云存储的传统delta编码方案更低的备份成本和更快的恢复速度,且在多数实际恢复场景中具有与传统基于本地存储资源的方案相同甚至更好的数据恢复性能。

关键词: 连续数据保护(CDP), 云, 数据去重

Abstract: This paper proposes MYCDP, a cloud-based block-level continuous data protection (CDP) system that optimizes both backup cost and recovery speed. MYCDP uses flexible and cost-effective cloud resources as its back-end storage and applies global data deduplication to reduce the volume of stored data, thereby achieving a lower backup cost. It also employs a purpose-built version index structure and a disk/memory hybrid local cache to accelerate recovery in cloud environments with low bandwidth and high latency. The paper describes the system design and its workflows in detail. Comparison tests between a prototype and other schemes show that MYCDP achieves both lower backup cost and faster recovery than cloud-based delta-encoding CDP approaches, and that in most practical recovery scenarios its recovery performance matches or exceeds that of traditional CDP approaches based on local storage.

Key words: continuous data protection (CDP), cloud, data deduplication
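
The abstract names two mechanisms, global block deduplication and a per-block version index, without giving their details. The following is a minimal Python sketch of how the two ideas can fit together in general. It is not MYCDP's actual design: the class `ToyCDPStore`, its method names, the use of SHA-256 fingerprints, and the in-memory dictionary standing in for cloud storage are all assumptions made for illustration. It also assumes that writes to a given block address arrive in non-decreasing timestamp order.

```python
import hashlib
from bisect import bisect_right
from collections import defaultdict


class ToyCDPStore:
    """Hypothetical toy CDP store: deduplicated blocks plus a version index."""

    def __init__(self):
        # fingerprint -> block data; a stand-in for objects kept in cloud storage
        self.chunks = {}
        # block address -> list of (timestamp, fingerprint), oldest first
        self.versions = defaultdict(list)
        # total bytes written by the protected volume, before deduplication
        self.logical_bytes = 0

    def write(self, timestamp, address, data):
        """Record one block write; identical content is stored only once."""
        fingerprint = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(fingerprint, data)
        self.versions[address].append((timestamp, fingerprint))
        self.logical_bytes += len(data)

    def read_at(self, timestamp, address):
        """Return the block content as of `timestamp`, or None if unwritten."""
        history = self.versions.get(address, [])
        times = [t for t, _ in history]
        # index of the first version strictly newer than `timestamp`
        i = bisect_right(times, timestamp)
        if i == 0:
            return None
        return self.chunks[history[i - 1][1]]

    def stored_bytes(self):
        """Bytes actually kept after deduplication."""
        return sum(len(d) for d in self.chunks.values())


store = ToyCDPStore()
store.write(1, 0, b"A" * 4096)
store.write(2, 1, b"A" * 4096)  # same content as the first write: no new chunk
store.write(3, 0, b"B" * 4096)  # block 0 overwritten; the old version is kept
```

In this sketch, recovering block 0 to time 2 is a lookup in its version list followed by one chunk fetch. A real system operating over a high-latency link would presumably place a local cache in front of the chunk fetch, which is the role the abstract assigns to its disk/memory hybrid cache.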