计算机科学与探索 ›› 2011, Vol. 5 ›› Issue (9): 781-790.

• 学术研究 • 上一篇    下一篇

不确定数据库上的top-k关键字查询

张 徵, 杨卫东, 朱 皓   

  1. 复旦大学 计算机科学技术学院, 上海 201203

  • 收稿日期:1900-01-01 修回日期:1900-01-01

Top-k Keyword Query on Uncertain Database

ZHANG Zheng, YANG Weidong, ZHU Hao

  

  1. School of Computer Science, Fudan University, Shanghai 201203, China
  • Received:1900-01-01 Revised:1900-01-01

摘要: 关系数据库上的关键字检索和不确定数据处理过去一直是两个独立的研究方向。研究了运用关键字方法检索不确定数据的问题, 定义了不确定关键字查询的基本模型和语义, 提出了一种在属性级粒度的不确定数据库上进行top-k关键字检索的算法。该算法根据用户指定的k值, 计算并返回分数最高的前k个结果, 其查询结果的评价函数综合考虑了结果与关键字的相关度和结果在可能世界语义下的概率大小。对算法进行了优化, 显著降低了计算复杂度。最后通过实验, 证明了算法的高效性和实用性。

关键词: 关键字检索, 不确定, top-k, 可能世界

Abstract: The problems of keyword search on relational databases and uncertain data management have been considered extensively, however addressed in isolation in the past. This paper introduces a novel method that combines IR-style keyword query with uncertain relational databases, and defines an uncertain model and its query semantics. The paper also shows a top-k algorithm to perform keyword search query on the attribute level, and return k query results which have maximum rank scores. Rank score of a query result is well-defined, depending on its correlation with query keywords and its possibility under the possible world. An optimized algorithm is introduced to reduce the complexity of the top-k query. The experimental results demonstrate the practicality and efficiency of these methods.

Key words: keyword search, uncertainty, top-k, possible world