Journal of Frontiers of Computer Science and Technology, 2017, Vol. 11, Issue (1): 37-45. DOI: 10.3778/j.issn.1673-9418.1511043

Research on Knowledge Graph Based Query Expansion Model and Its Retrieval Stability

HAO Linxue, ZHANG Peng+, SONG Dawei, HOU Yuexian   

  1. Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin 300350, China
  • Online: 2017-01-01 Published: 2017-01-10

融合知识图谱的查询扩展模型及其稳定性研究

郝林雪,张  鹏+,宋大为,候越先

  1. 天津大学 天津市认知计算与应用重点实验室,天津 300350

Abstract: This paper constructs a query expansion model based on Freebase: entities related to the query, together with their properties, are extracted from Freebase and used as expansion terms to reformulate the query so that it better expresses the user's information need. The weight of each property term is computed with the risk-reward analysis of portfolio theory, which maximizes the reward (the relevance of the property terms to the query) while minimizing the risk that these terms cause query expansion to fail. The paper also proposes a method for integrating the entities and their associated properties into the language modeling framework for query expansion. In the experiments, the retrieval effectiveness and stability of the expansion model, which relies solely on Freebase, are evaluated on two Web collections against the baseline language model (LM) and RM3, a traditional query expansion model based on pseudo-relevance feedback. The experimental results show that the proposed model outperforms the LM baseline by 6% to 15% in mean average precision (MAP), and that it is also more effective and more stable than RM3.

Key words: knowledge graph, Freebase, query expansion, effectiveness, stability
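
The abstract does not give the weighting formula, so the following is only a minimal sketch of how a risk-reward trade-off could be applied to expansion terms. It assumes a mean-variance score (mean relevance minus a risk-aversion factor times the variance) and a simple linear interpolation with the maximum-likelihood query model; the function names, the `risk_aversion` and `mix` parameters, and the sample relevance values are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from statistics import mean, pvariance


def risk_adjusted_scores(term_samples, risk_aversion=1.0):
    """Score each term as mean(relevance) - risk_aversion * variance(relevance).

    Assumed mean-variance form; the paper's exact formula is not in the abstract.
    """
    return {
        term: mean(samples) - risk_aversion * pvariance(samples)
        for term, samples in term_samples.items()
    }


def expansion_model(scores, top_k=2):
    """Keep the top_k positively scored terms and normalise them to sum to 1."""
    kept = sorted(
        ((term, score) for term, score in scores.items() if score > 0),
        key=lambda item: item[1],
        reverse=True,
    )[:top_k]
    total = sum(score for _, score in kept)
    return {term: score / total for term, score in kept} if total > 0 else {}


def expand_query(query_terms, expansion, mix=0.5):
    """Interpolate the maximum-likelihood query model with the expansion model."""
    counts = Counter(query_terms)
    n = sum(counts.values())
    if not expansion:
        # Nothing to add: fall back to the original query model.
        return {term: count / n for term, count in counts.items()}
    model = {term: (1 - mix) * count / n for term, count in counts.items()}
    for term, prob in expansion.items():
        model[term] = model.get(term, 0.0) + mix * prob
    return model


# Hypothetical relevance estimates of three property terms, one value per
# query-related entity in which the term occurs.
samples = {
    "guitarist": [0.8, 0.7, 0.9],  # high reward, low risk
    "album": [0.9, 0.1, 0.8],      # decent reward, high risk
    "london": [0.2, 0.1, 0.3],     # low reward, low risk
}

scores = risk_adjusted_scores(samples, risk_aversion=1.0)
expanded = expand_query(["eric", "clapton"], expansion_model(scores))
for term, prob in sorted(expanded.items(), key=lambda item: -item[1]):
    print(f"{term}: {prob:.3f}")
```

Raising `risk_aversion` penalises terms whose relevance varies widely across entities, so a volatile term such as "album" in this toy data drops below a consistently scored one. That is the stability intuition the abstract describes, shown here under the stated assumptions only.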

摘要: 旨在构建一种基于知识图谱Freebase的查询扩展模型,通过从Freebase中抽取与查询相关的若干实体及实体属性作为扩展词来重构查询,从而更好地表达用户的信息需求。在计算扩展词权重时,参考了投资组合理论中收益-风险分析方法,最大化扩展词和查询的相关性收益,同时也最小化扩展词可能带来的查询漂移的风险。最后将查询相关的实体和实体属性作为两种特征和查询语言模型结合实现查询扩展。在两个Web数据集上进行实验,用来检验所提出的扩展模型对检索系统的有效性和稳定性的影响。实验结果表明,提出的查询扩展模型与一元语言模型LM相比,检索结果的平均准确率(mean average precision,MAP)在两个数据集上有6%至15%的显著提升;和基于伪相关反馈的查询扩展模型RM3相比,有效性及稳定性都有不同程度的提升。

关键词: 知识图谱, Freebase, 查询扩展, 有效性, 稳定性