Journal of Frontiers of Computer Science and Technology ›› 2010, Vol. 4 ›› Issue (6): 531-541. DOI: 10.3778/j.issn.1673-9418.2010.06.005

• Academic Research •

Research on Optimization Technique in Multi-join Operation with Main-memory Storage Model*

ZHANG Yansong1,2+, YU Lisheng1,2, WANG Shan1,2, CHEN Hong1,2

  

  1. The MOE Key Lab of Data Engineering & Knowledge Engineering, Renmin University of China, Beijing 100872, China
  2. School of Information, Renmin University of China, Beijing 100872, China

  • Online: 2010-06-18 Published: 2010-06-18
  • Contact: ZHANG Yansong

Abstract: After analyzing database optimization techniques for advanced hardware platforms, this paper proposes a multi-join query optimization technique based on a main-memory storage model, in which dimensional tables are stored in main memory. The primary keys of each dimensional table are serialized so that every key equals the offset of its tuple in the in-memory table; a dimensional tuple can then be accessed directly at the start address of the table plus the offset. As a further optimization, dimensional tables are decomposed into main-memory column structures holding one or several fields each, which reduces the accessed tuple width and improves cache performance. In experiments, the approach is compared with simulated versions of the traditional SQL Server 2005 query plan, a join index algorithm, and an optimized column-model join algorithm. The results show that the multi-join algorithm based on the main-memory storage model performs well on complex queries with multiple predicates and multiple joins in star-schema data warehouses. Compared with the join index algorithm, it achieves equal performance without additional space cost. Compared with the column storage model, it offers better compatibility and performance, and it can also be implemented in traditional disk-resident database systems.

Key words: memory dimensional table, join elimination technique, multi-entry dimensional table accessing technique, sequence conscious storage structure
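
The direct-access idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the table names, column names, and data below are hypothetical, and the sketch assumes the dimension primary keys have already been renumbered to 0..n-1 so that each key equals the position of its record. The dimension is stored column-wise, so a query reads only the attribute it filters on.

```python
from array import array

# Hypothetical in-memory dimension table, stored column-wise.
# Assumption: customer keys are serialized to 0..3, so key k lives at offset k.
customer_region = array("b", [0, 1, 0, 2])   # region code per customer key
customer_segment = array("b", [1, 1, 0, 0])  # second column, untouched by the query

# Hypothetical fact rows: (customer_key, revenue).
fact_rows = [(0, 100), (3, 250), (1, 40), (0, 60), (2, 75)]


def revenue_by_region(facts, region_column, wanted_region):
    """Sum revenue over fact rows whose customer is in wanted_region."""
    total = 0
    for customer_key, revenue in facts:
        # The foreign key is used directly as the array offset:
        # no hash probe and no separate join index structure.
        if region_column[customer_key] == wanted_region:
            total += revenue
    return total


def revenue_by_region_hash(facts, dimension_rows, wanted_region):
    """Baseline for comparison: build a hash table on the dimension, then probe."""
    region_of = {key: region for key, region in dimension_rows}
    return sum(rev for key, rev in facts if region_of[key] == wanted_region)


if __name__ == "__main__":
    print(revenue_by_region(fact_rows, customer_region, 0))
```

Both functions return the same totals; the difference is that the first replaces the hash lookup with a positional read, which is the effect the abstract attributes to making keys coincide with memory offsets.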
