Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2017, Vol. 11, Issue (10): 1531-1544. DOI: 10.3778/j.issn.1673-9418.1701044

• Survey and Exploration •

Review of Data Recovery in Storage Systems Based on Erasure Codes

YANG Songlin, ZHANG Guangyan+

  1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Online: 2017-10-01 Published: 2017-10-20

Abstract: Erasure codes offer low storage overhead, but data recovery in erasure-coded storage takes a long time and noticeably degrades the performance of foreground applications. This paper presents a model for computing the completion time of data recovery with erasure codes and identifies the key factors that affect recovery performance. On that basis, it adopts computation overhead, read/write overhead, and transmission overhead as the criteria for evaluating recovery performance. It then analyzes how existing work reduces each of these three overheads, with emphasis on the strengths and weaknesses of the key techniques. Finally, it compares existing coding schemes in terms of recovery performance, reliability, and storage overhead, and points out possible directions for future research.
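
The trade-off stated in the abstract (low storage overhead, costly recovery) can be illustrated with a minimal sketch. It does not reproduce the paper's recovery-time model. It assumes a standard MDS code such as Reed-Solomon with k data blocks and m parity blocks, where rebuilding one lost block reads k surviving blocks, and compares it with plain replication. The names `Scheme`, `replication`, `mds_code`, and `repair_traffic_mb` are hypothetical and introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    name: str
    data_blocks: int   # k: blocks that hold original data
    total_blocks: int  # n: blocks actually stored
    repair_reads: int  # blocks read to rebuild one lost block

    @property
    def storage_overhead(self) -> float:
        # Stored bytes per byte of user data.
        return self.total_blocks / self.data_blocks


def replication(copies: int) -> Scheme:
    # Assumption: a lost replica is rebuilt by copying one surviving replica.
    return Scheme(f"{copies}-way replication", 1, copies, 1)


def mds_code(k: int, m: int) -> Scheme:
    # Assumption: a conventional MDS code (e.g. Reed-Solomon) reads k
    # surviving blocks to rebuild a single lost block.
    return Scheme(f"RS({k},{m})", k, k + m, k)


def repair_traffic_mb(scheme: Scheme, block_mb: float) -> float:
    # Data read (and typically transferred) to repair one lost block.
    return scheme.repair_reads * block_mb


if __name__ == "__main__":
    for s in (replication(3), mds_code(6, 3)):
        print(s.name, s.storage_overhead, repair_traffic_mb(s, 64))
```

Under these assumptions, RS(6,3) stores 1.5 bytes per user byte where 3-way replication stores 3, but repairing a single 64 MB block reads 384 MB instead of 64 MB. That extra reading is the kind of read/write and transmission overhead the abstract names as evaluation criteria.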

Key words: erasure codes, multiple replicas, data recovery, performance improvement