Journal of Frontiers of Computer Science and Technology, 2014, Vol. 8, Issue (1): 61-72. DOI: 10.3778/j.issn.1673-9418.1305048

Research on Data Space Index Method Based on iMeMex Data Model

WANG Hongbin1, ZHOU Lianke1, WANG Nianbin1+, DENG Shengchun2   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  2. School of Software, Harbin Institute of Technology, Harbin 150001, China
  • Online: 2014-01-01  Published: 2014-01-03

基于iMeMex数据模型的数据空间索引方法研究

王红滨1,周连科1,王念滨1+,邓胜春2   

  1. 哈尔滨工程大学 计算机科学与技术学院,哈尔滨 150001
    2. 哈尔滨工业大学 软件学院,哈尔滨 150001

Abstract: The volume of information held by individuals and organizations continues to grow rapidly, and the share of unstructured data keeps increasing. A data space consists of vast amounts of data that are massive, distributed, heterogeneous, and autonomous, so letting users retrieve the information they are interested in quickly and efficiently is a major challenge. An effective index for the heterogeneous data in a data space is the foundation for addressing this challenge. After analyzing the characteristics of the iMeMex data model and the query methods used in data spaces, this paper proposes an index method based on the iMeMex data model that extends the inverted list in order to improve query efficiency over the heterogeneous data in a data space. The new method indexes resource views by extending the keyword column and the linked-list node information of the inverted list, and thereby supports and speeds up keyword queries, predicate queries, and path queries. Experimental results demonstrate the feasibility and effectiveness of the proposed method.

Key words: data space, index, iMeMex data model, inverted list
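
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no data structures, so the class names, the key layout, and the path-prefix matching below are all assumptions made for illustration. The `ResourceView` class is loosely modeled on an iMeMex-style resource view (a name, an attribute tuple, content, and child views). The key column is extended from a bare term to a tagged key, and each posting node additionally stores the path of the view it refers to, so that one structure can answer keyword, predicate, and path queries.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ResourceView:
    """Hypothetical, simplified resource view: name, attributes, content, children."""
    name: str
    attrs: dict = field(default_factory=dict)
    content: str = ""
    children: list = field(default_factory=list)


@dataclass(frozen=True)
class Posting:
    """Posting node extended with the path of the view (an assumption of this sketch)."""
    view_id: int
    path: str


class ExtendedInvertedIndex:
    def __init__(self):
        # Key column extended from a bare term to a tagged key:
        # ("term", word) for content words, ("attr", name, value) for attributes.
        self.postings = defaultdict(list)
        self.paths = {}  # view_id -> path from the root
        self._next_id = 0

    def add(self, view, parent_path=""):
        """Index a view and, recursively, all of its children."""
        view_id = self._next_id
        self._next_id += 1
        path = f"{parent_path}/{view.name}"
        self.paths[view_id] = path
        node = Posting(view_id, path)
        for word in set(view.content.lower().split()):
            self.postings[("term", word)].append(node)
        for attr, value in view.attrs.items():
            self.postings[("attr", attr, str(value).lower())].append(node)
        for child in view.children:
            self.add(child, path)
        return view_id

    def keyword(self, word):
        """Keyword query: views whose content contains the word."""
        return {p.view_id for p in self.postings.get(("term", word.lower()), [])}

    def predicate(self, attr, value):
        """Predicate query: views whose attribute equals the value."""
        key = ("attr", attr, str(value).lower())
        return {p.view_id for p in self.postings.get(key, [])}

    def path_query(self, prefix, word):
        """Path query: keyword matches restricted to views below a path prefix."""
        prefix = prefix.rstrip("/") + "/"
        return {p.view_id for p in self.postings.get(("term", word.lower()), [])
                if (p.path + "/").startswith(prefix)}


# Illustrative data only; the names and values are invented for this example.
root = ResourceView("home", children=[
    ResourceView("papers", children=[
        ResourceView("idm.pdf", {"type": "pdf", "year": 2006}, "data space index"),
    ]),
    ResourceView("mail", children=[
        ResourceView("msg1", {"type": "email"}, "index meeting notes"),
    ]),
])
idx = ExtendedInvertedIndex()
idx.add(root)

print(idx.keyword("index"))                   # views containing "index"
print(idx.predicate("type", "pdf"))           # views with type = pdf
print(idx.path_query("/home/mail", "index"))  # "index" below /home/mail
```

Storing the path in each posting node lets a path query be answered from the posting list alone, without a second lookup per candidate view. The cost is a larger index, and whether the paper makes the same trade-off cannot be determined from the abstract.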

摘要: 目前,个人和组织的信息呈现急剧增长趋势,且非结构化数据所占比重在不断增加,这些属于某个主体的海量、分布、异构和共存的数据构成了一个异构数据空间,如何为用户提供高效、便捷和多样化的搜索查询服务是数据空间面临的巨大挑战,为数据空间中异构数据构建高效的索引方法是解决这一问题的基础。对iMeMex数据模型的特点和数据空间中查询方法进行了分析,在此基础上通过扩展倒排列表方法,提出了一种基于iMeMex数据模型的索引方法,来提高对数据空间中异构数据的搜索查询效率。新的索引方法通过扩展倒排列表的关键字列和链表节点信息索引资源视图,来支持和提高关键字查询、谓词查询和路径查询的处理效率。实验结果表明,该索引方法能够有效、可行地解决数据空间中异构数据索引和查询效率问题。

关键词: 数据空间, 索引, iMeMex数据模型, 倒排列表