Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2015, Vol. 9 ›› Issue (12): 1430-1438. DOI: 10.3778/j.issn.1673-9418.1508024

• Database Technology •

Application of Neighborhood Reversibility Verifying in Heterogeneous Data Similarity Measure

YANG Tao (杨涛), WEI Shikui+ (韦世奎), ZHU Zhenfeng (朱振峰), ZHAO Yao (赵耀)

  1. Institute of Information Science, College of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • Online: 2015-12-01 Published: 2015-12-04

Abstract: Neighborhood non-reversibility occurs among homogeneous, single-media data objects such as images: object A is one of the K nearest neighbors of object B, but B is not among the K nearest neighbors of A. The same phenomenon exists among heterogeneous data objects. Although neighborhood reversibility verification has been studied for homogeneous data, it has not yet been reported for heterogeneous data. This paper focuses on the neighborhood reversibility problem among multimedia documents (MMDs). First, a cross-media retrieval framework named LE-KNN (Laplacian eigenmaps-K nearest neighbors) is built to perform cross-media retrieval and to evaluate the effect of neighborhood reversibility on it. Second, based on this framework, two methods for improving neighborhood reversibility are introduced. On the one hand, the neighborhood relationship matrix of the multimedia documents is redefined to improve the neighborhood reversibility among them. On the other hand, the contextual dissimilarity measure (CDM) algorithm is added to the single-modality retrieval framework to adjust the distances among the media objects in the dataset so that the neighborhood distances of the data are as equal as possible, which improves the neighborhood reversibility of the data. Experimental results show that improving the neighborhood reversibility among multimedia documents helps raise the accuracy of cross-media retrieval.

Key words: neighborhood reversibility, multimedia documents, cross-media retrieval
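
The two ideas named in the abstract, neighborhood non-reversibility and distance adjustment toward equal neighborhood distances, can be illustrated on a toy distance matrix. The following is only a minimal sketch under stated assumptions, not the authors' LE-KNN implementation: the function names, the mutual-neighbor rule used as one possible "redefined" neighborhood relation, and the one-pass rescaling in the spirit of CDM are all hypothetical choices made for illustration, since the abstract does not give the exact formulas.

```python
import numpy as np

def knn_mask(dist, k):
    """mask[i, j] is True when j is among the k nearest neighbors of i."""
    d = dist.astype(float).copy()
    np.fill_diagonal(d, np.inf)              # an object is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]       # k closest objects per row
    n = d.shape[0]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), idx.ravel()] = True
    return mask

def reversibility_rate(dist, k):
    """Fraction of k-NN relations i -> j for which j -> i also holds."""
    m = knn_mask(dist, k)
    return (m & m.T).sum() / m.sum()

def mutual_knn_mask(dist, k):
    """Assumed redefinition: keep a neighbor relation only if it is reciprocal."""
    m = knn_mask(dist, k)
    return m & m.T

def neighborhood_radius(dist, k):
    """Mean distance from each object to its k nearest neighbors."""
    d = dist.astype(float).copy()
    np.fill_diagonal(d, np.inf)
    return np.sort(d, axis=1)[:, :k].mean(axis=1)

def cdm_like_rescale(dist, k):
    """Simplified one-pass rescaling in the spirit of CDM (an assumption).

    Each distance is divided by the geometric mean of the two objects'
    neighborhood radii and multiplied by the global geometric-mean radius,
    which pulls the radii toward a common value. Requires all radii > 0.
    """
    r = neighborhood_radius(dist, k)
    r_bar = np.exp(np.log(r).mean())         # geometric mean of the radii
    return dist * r_bar / np.sqrt(np.outer(r, r))

# Toy data: four points on a line; point 3.0 has 1.0 as its nearest neighbor,
# but the nearest neighbor of 1.0 is 0.0, so that relation is not reversible.
points = np.array([0.0, 1.0, 3.0, 7.0])
D = np.abs(points[:, None] - points[None, :])

print(knn_mask(D, 1).astype(int))
print("reversibility rate:", reversibility_rate(D, 1))
print("radii before:", neighborhood_radius(D, 1))
print("radii after: ", neighborhood_radius(cdm_like_rescale(D, 1), 1))
```

The mutual-neighbor mask is symmetric by construction, which is one simple way to obtain a fully reversible neighborhood relation; the rescaling step keeps the distance matrix symmetric while narrowing the spread of the neighborhood radii.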