Journal of Frontiers of Computer Science and Technology ›› 2014, Vol. 8 ›› Issue (1): 51-60. DOI: 10.3778/j.issn.1673-9418.1305038

Design of Multi-Dimensional Information Network Datawarehouse Model for Online Graph Processing

NIE Zhangyan1,2,3, LI Chuan1,2,3+, TANG Changjie1,2, XU Hongyu1,2, ZHANG Yonghui1,2, YANG Ning1,2   

  1. College of Computer Science, Sichuan University, Chengdu 610065, China
    2. National Key Laboratory of Air Control Automation System Technology, Chengdu 610065, China
    3. State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, China
  • Online:2014-01-01 Published:2014-01-03

Design of a Multi-Dimensional Information Network Data Warehouse Model for OLGP

聂章艳1,2,3,李  川1,2,3+,唐常杰1,2,徐洪宇1,2,张永辉1,2,杨  宁1,2   

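  1. College of Computer Science, Sichuan University, Chengdu 610065
    2. National Key Laboratory of Air Control Automation System Technology, Chengdu 610065
    3. State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072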

Abstract: With the emergence of information networks, information has evolved from simple numerical data into complex graph-structured networks. How to organize and store information network data has become an urgent problem. This paper proposes a multi-dimensional information network data warehouse model (MINDM), which aims to provide the data foundation for online graph processing (OLGP). The MINDM consists of an edge fact table, a node fact table, an information-dimension link attribute table, and a topological-dimension node attribute table. Experimental results show that, compared with the universal relation table, the MINDM eliminates redundancy, reduces average query time, and saves storage space. On a real dataset of 12,500 ACM papers, its query time remains stable at tens of milliseconds, in sharp contrast to the universal relation table, which takes hundreds of milliseconds for the same queries. As the number of papers grows, the storage space of the proposed model increases much more slowly than that of the universal relation table.

Key words: information network, informational dimension, topological dimension, online graph processing, multi-dimensional information network data warehouse model

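Abstract: The emergence of information networks has turned information from simple numerical data into more complex graph network structures. How to organize and store graph-based information network data well has become a pressing problem. Using dimensional modeling, this paper designs a model for graph-based information network data and proposes a multi-dimensional information network data warehouse model. The model consists of an edge fact table, a node fact table, an information-dimension link attribute table, and a topological-dimension node attribute table, and can provide the underlying data platform for online graph processing. Experiments show that the model has clear advantages over the universal relation table in redundancy elimination, query time, and storage space. On 12,500 ACM papers, the new model's query time is stable at tens of milliseconds, about one order of magnitude lower than that of the universal relation table. In terms of space, as the number of papers increases, the model's storage overhead grows far more slowly than that of the universal relation table.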

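Key words: information network, informational dimension, topological dimension, online graph processing, multi-dimensional information network data warehouse model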
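
The four tables named in the abstract can be pictured as a small relational schema. The following is only an illustrative sketch in Python using the standard-library `sqlite3` module. The abstract does not give the paper's column definitions, so every table name, every column name (`node_attr_id`, `link_attr_id`, `degree`, `weight`, and so on), and the toy data are assumptions made for illustration, not the authors' actual design.

```python
import sqlite3

# Hypothetical schema: the abstract names the four tables but not their columns.
SCHEMA = """
CREATE TABLE topo_node_attr (          -- topological-dimension node attribute table
    node_attr_id INTEGER PRIMARY KEY,
    node_type    TEXT NOT NULL,
    label        TEXT NOT NULL
);
CREATE TABLE info_link_attr (          -- information-dimension link attribute table
    link_attr_id INTEGER PRIMARY KEY,
    link_type    TEXT NOT NULL
);
CREATE TABLE node_fact (               -- node fact table (one row per node)
    node_id      INTEGER PRIMARY KEY,
    node_attr_id INTEGER NOT NULL REFERENCES topo_node_attr(node_attr_id),
    degree       INTEGER NOT NULL      -- assumed measure
);
CREATE TABLE edge_fact (               -- edge fact table (one row per edge)
    edge_id      INTEGER PRIMARY KEY,
    src_node_id  INTEGER NOT NULL REFERENCES node_fact(node_id),
    dst_node_id  INTEGER NOT NULL REFERENCES node_fact(node_id),
    link_attr_id INTEGER NOT NULL REFERENCES info_link_attr(link_attr_id),
    weight       REAL NOT NULL         -- assumed measure
);
"""


def build_mindm_sketch(conn):
    """Create the four (hypothetical) tables on an open connection."""
    conn.executescript(SCHEMA)


def load_toy_data(conn):
    """Insert one author, two papers, and three edges (made-up data)."""
    conn.executemany(
        "INSERT INTO topo_node_attr VALUES (?, ?, ?)",
        [(1, "author", "A. Author"), (2, "paper", "Paper X"), (3, "paper", "Paper Y")],
    )
    conn.executemany(
        "INSERT INTO info_link_attr VALUES (?, ?)",
        [(1, "writes"), (2, "cites")],
    )
    conn.executemany(
        "INSERT INTO node_fact VALUES (?, ?, ?)",
        [(1, 1, 2), (2, 2, 2), (3, 3, 2)],
    )
    conn.executemany(
        "INSERT INTO edge_fact VALUES (?, ?, ?, ?, ?)",
        [(1, 1, 2, 1, 1.0), (2, 1, 3, 1, 1.0), (3, 2, 3, 2, 1.0)],
    )
    conn.commit()


def out_neighbors(conn, node_id):
    """Return (label, link_type) for every edge leaving node_id, sorted by label."""
    query = """
        SELECT a.label, l.link_type
        FROM edge_fact AS e
        JOIN node_fact AS n      ON n.node_id = e.dst_node_id
        JOIN topo_node_attr AS a ON a.node_attr_id = n.node_attr_id
        JOIN info_link_attr AS l ON l.link_attr_id = e.link_attr_id
        WHERE e.src_node_id = ?
        ORDER BY a.label
    """
    return conn.execute(query, (node_id,)).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_mindm_sketch(conn)
    load_toy_data(conn)
    print(out_neighbors(conn, 1))
```

In this sketch each node label and link type is stored once and referenced by key from the fact tables. A single universal relation table would instead repeat those attribute values on every edge row, which is the kind of redundancy the abstract says the model eliminates.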