计算机科学与探索 ›› 2010, Vol. 4 ›› Issue (5): 401-409.DOI: 10.3778/j.issn.1673-9418.2010.05.002

• 学术研究 • 上一篇    下一篇

Sub-Join:面向闪存数据库的查询优化算法*

梁智超;周 大; 孟小峰+   

  1. 中国人民大学 信息学院, 北京 100872
  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2010-05-11 发布日期:2010-05-11
  • 通讯作者: 孟小峰

Sub-Join: Query Optimization Algorithm for Flash-Based Database*

LIANG Zhichao;ZHOU Da; MENG Xiaofeng+   

  1. Information School, Renmin University of China, Beijing 100872, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2010-05-11 Published:2010-05-11
  • Contact: MENG Xiaofeng

摘要: 固态硬盘具有高速的随机读取速度、低功耗、体积小等特点, 被认为将取代磁盘成为新一代的数据存储设备。但是闪存数据库的查询性能的提高却远小于固态硬盘相比于磁盘I/O性能的提高, 其原因在于现有的数据库是基于磁盘设计的, 不能充分发挥固态硬盘的高速性能。提出一种名为子连接(Sub-Join)的连接算法。首先将数据表的连接列和主键投影为新的子表, 然后对子表进行接连操作, 最后根据子表的连接结果再从原始数据表中回取查询结果。通过和开源数据库Oracle Berkeley DB的比较实验, 结果表明子连接算法比原有算法的性能提高了40%~100%, 充分说明了它的优越性。

关键词: 固态硬盘, 闪存, 闪存数据库, 查询优化, 子连接

Abstract: Compared with hard drive disk (HDD), solid state disk (SSD) has a lot of advantages, such as high ran-dom read performance, low power consumption and lightweight form. Therefore it is envisioned to be next genera-tion data storage instead of HDD. However, the enhancement of query performance for flash-based database is not the same as the I/O ratio of SSD to HDD. The reason is existing databases which are designed for HDD can not take full advantage of high I/O performance of SSD. A new join algorithm, Sub-Join, is proposed. Sub-Join firstly pro-jects the column of join and primary key as Sub-Table, and then executes join operations on Sub-Tables. Finally re-sults are gotten from original table according to the result of join on Sub-Tables. The compared experiments with Oracle Berkeley DB show Sub-Join outperforms original indexed nested-loop join at the ratio of about 40%~100%, which shows the high efficiency of this method.

Key words: solid state disk (SSD), flash memory, flash-based database, query optimization, Sub-Join

中图分类号: