Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (6): 941-949. DOI: 10.3778/j.issn.1673-9418.1807054

Main-Memory Database Log Recovery Techniques with Data Popularity in NUMA Architecture

WU Gang1,2, Abudurexiti REHEMAN1+, LI Liang1, QIAO Baiyou1, HAN Donghong1   

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
  2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
  • Online: 2019-06-01 Published: 2019-06-14

NUMA架构下数据热度的内存数据库日志恢复技术

吴  刚1,2,阿卜杜热西提·热合曼1+,李  梁1,乔百友1,韩东红1   

  1. 东北大学 计算机科学与工程学院,沈阳 110819
  2. 南京大学 计算机软件新技术国家重点实验室,南京 210093

Abstract: In main-memory database failure recovery, command logging is a coarse-grained, lightweight logging method designed for main-memory databases. However, when it is used for failure recovery in a data-oriented database design under the non-uniform memory access (NUMA) architecture, uneven data access frequency overloads the CPU threads responsible for recovering frequently accessed data, while the other CPUs remain relatively idle. To address the increased recovery time caused by this unbalanced workload, this paper proposes a main-memory database log recovery algorithm based on data popularity in the NUMA architecture. The algorithm records the number of accesses to each data item as that item's popularity. During parallel recovery, data are distributed more evenly, according to their popularity, across the CPU threads on each node that perform the recovery operations, which improves database recovery speed. The experimental results show that this solution is faster than the conventional recovery scheme under the NUMA architecture. Moreover, the higher the data popularity, the more pronounced the improvement in recovery speed, with a maximum improvement of 19%.
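
The balancing idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the data are partitioned, so the code below substitutes a generic greedy heuristic (hottest item first, assigned to the least-loaded thread) and should not be read as the authors' algorithm. The log shape (an iterable of `(key, command)` pairs) and the function names `record_popularity` and `balance_by_popularity` are hypothetical, and NUMA node placement is not modeled.

```python
import heapq
from collections import Counter


def record_popularity(command_log):
    """Count how many logged commands touch each data item.

    Assumption: each log entry is a (key, command) pair; the real
    command-log format is not given in the abstract.
    """
    return Counter(key for key, _ in command_log)


def balance_by_popularity(popularity, num_threads):
    """Assign data items to recovery threads so access counts are roughly even.

    Greedy heuristic (an illustrative stand-in, not the paper's method):
    take items from hottest to coldest and give each one to the thread
    with the smallest accumulated load so far.
    """
    # Min-heap of (accumulated_load, thread_id).
    heap = [(0, t) for t in range(num_threads)]
    heapq.heapify(heap)
    assignment = {t: [] for t in range(num_threads)}

    # Sort by descending popularity; break ties by key for determinism.
    ordered = sorted(popularity.items(), key=lambda kv: (-kv[1], str(kv[0])))
    for key, count in ordered:
        load, thread_id = heapq.heappop(heap)
        assignment[thread_id].append(key)
        heapq.heappush(heap, (load + count, thread_id))

    loads = {t: sum(popularity[k] for k in keys) for t, keys in assignment.items()}
    return assignment, loads


if __name__ == "__main__":
    # Skewed toy log: item "a" is touched far more often than the others.
    log = [("a", "upd")] * 6 + [("b", "upd")] * 3 + [("c", "upd")] * 2 + [("d", "upd")]
    assignment, loads = balance_by_popularity(record_popularity(log), num_threads=2)
    print(assignment, loads)
```

In this toy log, a partition that ignores popularity could place the hot item "a" together with other items on one thread and leave the second thread mostly idle; weighting by access count instead spreads the replay work across both threads.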

Key words: main-memory database, logging, checkpoint, failure recovery, non-uniform memory access (NUMA) architecture

摘要: 在内存数据库故障恢复技术中,命令日志是针对内存数据库设计的粗粒度的、轻量级的日志记录方式。但在非统一内存访问(non-uniform memory access,NUMA)体系架构下面向数据的数据库设计中利用命令日志进行故障恢复时,由于数据访问频率不均衡,导致负责高频数据恢复的CPU线程负载加重,而其他CPU相对空闲。针对这种工作负载不均衡所导致的恢复时间开销增大的情况,提出了NUMA体系架构下基于热度记录的内存数据库日志恢复算法。该算法中,每一条数据的访问次数作为该数据的热度记录下来。在并行恢复时,根据数据热度,将数据比较均衡地划分到各个节点的CPU线程执行恢复操作,以此来提高数据库的恢复速度。实验结果表明,该方案比NUMA架构下的常规恢复方案快,而且数据的热度越高,恢复速度的提升越明显,最高提升了19%。

关键词: 内存数据库, 日志, 检查点, 故障恢复, 非统一内存访问(NUMA)架构