Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2016, Vol. 10, Issue 3: 414-424. DOI: 10.3778/j.issn.1673-9418.1505005

• Artificial Intelligence and Pattern Recognition •

Hybrid Distance Metric Learning Method with Side-Information
(利用边信息的混合距离学习算法)

GUO Yingjie (郭瑛洁)+, WANG Shitong (王士同)

  1. School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2016-03-01 Published: 2016-03-11

Abstract: Learning a distance function from side-information plays a key role in many data mining applications. Conventional metric learning approaches usually represent the distance function in the form of a Mahalanobis distance, which has certain limitations. This paper proposes a metric learning method based on a hybrid distance: the unknown distance metric of a dataset is represented as a linear combination of several candidate distance metrics, and the weight of each candidate is learned from the side-information of the data to obtain a new distance function. The new distance function is then applied to a clustering algorithm to verify its effectiveness. Experiments on datasets from the UCI (University of California, Irvine) machine learning repository, comparing against existing methods for learning distance functions with side-information, show that the proposed method has clear advantages.

Key words: distance metric learning, hybrid distance metric, distance function, side-information
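
The idea of a hybrid distance can be illustrated with a minimal sketch. The abstract does not state which candidate distances are used, what form the side-information takes, or how the weights are optimized, so everything below is an assumption made for illustration: the candidates are Euclidean, Manhattan, and Chebyshev distances, the side-information is given as must-link and cannot-link pairs, and `learn_weights` is a hypothetical heuristic rather than the paper's learning procedure.

```python
import math

# Candidate distances. This particular set is an assumption;
# the abstract does not list the candidates used in the paper.
def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

CANDIDATES = [euclidean, manhattan, chebyshev]

def hybrid_distance(x, y, weights, candidates=CANDIDATES):
    """Linear combination of candidate distances with the given weights."""
    return sum(w * d(x, y) for w, d in zip(weights, candidates))

def learn_weights(must_link, cannot_link, candidates=CANDIDATES):
    """Hypothetical heuristic, NOT the paper's optimization.

    Each candidate is scored by the ratio of its mean distance over
    cannot-link pairs to its mean distance over must-link pairs, and
    the scores are normalized so the weights are non-negative and sum to 1.
    """
    scores = []
    for d in candidates:
        ml = sum(d(x, y) for x, y in must_link) / len(must_link)
        cl = sum(d(x, y) for x, y in cannot_link) / len(cannot_link)
        scores.append(cl / (ml + 1e-12))  # small constant avoids division by zero
    total = sum(scores)
    return [s / total for s in scores]

# Toy side-information: pairs known to belong together or apart.
must_link = [((0.0, 0.0), (0.1, 0.0)), ((5.0, 5.0), (5.0, 5.2))]
cannot_link = [((0.0, 0.0), (5.0, 5.0)), ((0.1, 0.0), (5.0, 5.2))]

weights = learn_weights(must_link, cannot_link)
print("weights:", weights)
print("hybrid distance:", hybrid_distance((0.0, 0.0), (5.0, 5.0), weights))
```

Because the weights are non-negative, the combined function keeps the symmetry and triangle inequality of its candidates, so it could be passed to any clustering routine that accepts a custom distance.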