Journal of Frontiers of Computer Science and Technology ›› 2013, Vol. 7 ›› Issue (4): 315-325. DOI: 10.3778/j.issn.1673-9418.1212013

Research and Implementation on Erasure Code in Cloud File System

CHENG Zhendong1+, LUAN Zhongzhi1, MENG You1, LI Liangshu1, HE Rong1, YANG Tingting1, QIAN Depei1, GUAN Gang2, CHEN Wei2   

  1. Sino-German Joint Software Institute, School of Computer Science and Engineering, Beihang University, Beijing 100191, China
  2. Tencent Corporation, Shenzhen, Guangdong 518057, China
  • Online: 2013-04-01 Published: 2013-04-02

云文件系统中纠删码技术的研究与实现

程振东1+,栾钟治1,孟  由1,李亮淑1,和  荣1,杨婷婷1,钱德沛1,管  刚2,陈  伟2   

  1. 北京航空航天大学 计算机学院 中德联合软件研究所,北京 100191
  2. 深圳市腾讯计算机系统有限公司,广东 深圳 518057

Abstract: Cloud file systems have become the core and foundation of cloud storage and big data thanks to their high performance, high scalability, high availability, and ease of management. They generally rely on full replication to improve fault tolerance, the efficiency of data resource use, and system performance. However, the storage overhead of replication grows linearly with the number of replicas, and storing the replicas costs extra write bandwidth and data management overhead. Erasure codes, with a reasonable amount of coded redundancy, can ensure high data reliability and availability without adding excessive storage space. This paper studies the application of erasure codes in cloud file systems from six aspects: erasure code type, coding object, coding time, data modification, data access method, and data access performance. It then discusses the challenges and tradeoffs of designing erasure codes for cloud file systems, with the aim of strengthening their storage model. Finally, it designs and implements an erasure coding prototype for a cloud file system. Experiments show that erasure codes can effectively protect data availability in a cloud file system while saving storage space.

Key words: cloud computing, cloud file system, redundancy, erasure code
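
The storage argument in the abstract can be illustrated with a small sketch. It is not the paper's prototype: the abstract does not say which erasure code the authors use, so the (k, m) parameters, the function names, and the single-parity XOR code below are assumptions chosen only because they are the simplest way to show the idea. Replication stores r full copies (overhead r), while a code that adds m coded blocks to k data blocks stores (k + m) / k times the original data.

```python
from functools import reduce


def replication_overhead(replicas: int) -> float:
    """Storage used per byte of user data with full replication."""
    return float(replicas)


def erasure_overhead(k: int, m: int) -> float:
    """Storage used per byte of user data with k data + m coded blocks."""
    return (k + m) / k


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (a single-parity code, m = 1)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical example: three 4-byte data blocks plus one parity block.
data = [b"clou", b"dfil", b"esys"]
parity = xor_blocks(data)

# Simulate losing data[1]; XOR of the surviving blocks and the parity
# block rebuilds it, because x ^ x cancels out.
recovered = xor_blocks([data[0], data[2], parity])
```

With these assumed parameters, three-way replication stores 3 bytes per byte of data, whereas the single-parity layout above stores 4/3 bytes and still survives the loss of any one block. Codes that tolerate more simultaneous failures, such as Reed-Solomon codes, need more coded blocks and more computation, which is the kind of tradeoff the abstract refers to.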

摘要: 云文件系统凭借高性能、高扩展、高可用、易管理等特点,成为云存储和大数据的基础和核心。云文件系统一般采用完全副本技术来提升容错能力,提高数据资源的使用效率和系统性能。但完全副本的存储开销随着副本数目的增加呈线性增长,存储副本时造成额外的写带宽和数据管理开销。纠删码在没有增加过量的存储空间的基础上,通过合理的冗余编码来保证数据的高可靠性和可用性。研究了纠删码技术在云文件系统中的应用,从纠删码类型、编码对象、编码时机、数据更改、数据访问方式和数据访问性能等六个方面,对云文件系统中纠删码的设计进行了探究,以增强云文件系统的存储模型。在此基础上,设计并实现了纠删码原型系统,并通过实验证明了纠删码能有效地保障云文件系统的数据可用性,并且节省存储空间。

关键词: 云计算, 云文件系统, 冗余, 纠删码