计算机科学与探索 (Journal of Frontiers of Computer Science and Technology) ›› 2017, Vol. 11 ›› Issue (5): 708-719. DOI: 10.3778/j.issn.1673-9418.1603051

• Academic Research •

Selection of Target Function in Label Distribution Learning

ZHAO Quan (1,2), GENG Xin (1,2,+)

  1. Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 211189, China
  2. School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
  • Online: 2017-05-01  Published: 2017-05-04

Abstract: Label distribution learning is a machine learning paradigm proposed in recent years. In theory, it can be seen as a generalization of multi-label learning. Previous studies show that it is an effective paradigm that handles certain label ambiguity problems well, and several specialized algorithms with good predictive performance have been proposed for it. This paper proposes a generalized label distribution learning framework for these specialized algorithms. In this framework, a specialized algorithm consists of three parts: a target function (i.e., an objective function), an output model, and an optimization method. This paper studies the target function part of the framework. To examine how the choice of distance used as the target function affects the predictive performance of a label distribution learning algorithm, 7 representative distances are selected. Based on the characteristics of each distance and on experimental results from 5 real label distribution datasets, the paper offers concrete suggestions for choosing a target function.

Key words: label distribution learning, maximum entropy model, quasi-Newton method, selection of target functions
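
The three-part framework described in the abstract (target function, output model, optimization method) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the 7 distances, so the KL divergence and the squared Euclidean distance used here are illustrative assumptions. The output model is a maximum entropy (softmax) model and the optimizer is SciPy's L-BFGS-B, one quasi-Newton method, both chosen to match the keywords. All function names and the toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def max_entropy_output(theta, X):
    """Output model: p(y|x) proportional to exp(theta[y] . x)."""
    scores = X @ theta.T                          # shape (n_samples, n_labels)
    scores -= scores.max(axis=1, keepdims=True)   # avoid overflow in exp
    expo = np.exp(scores)
    return expo / expo.sum(axis=1, keepdims=True)


def kl_divergence(D, P, eps=1e-12):
    """Candidate target function (assumed): mean KL(D || P) over samples."""
    return np.mean(np.sum(D * np.log((D + eps) / (P + eps)), axis=1))


def squared_euclidean(D, P):
    """Candidate target function (assumed): mean squared Euclidean distance."""
    return np.mean(np.sum((D - P) ** 2, axis=1))


def fit_ldl(X, D, target, max_iter=200):
    """Optimization: minimize the chosen target with a quasi-Newton method."""
    n_features, n_labels = X.shape[1], D.shape[1]

    def loss(flat_theta):
        theta = flat_theta.reshape(n_labels, n_features)
        return target(D, max_entropy_output(theta, X))

    # No gradient is supplied, so SciPy approximates it by finite differences.
    result = minimize(loss, np.zeros(n_labels * n_features),
                      method="L-BFGS-B", options={"maxiter": max_iter})
    return result.x.reshape(n_labels, n_features)


# Toy data (hypothetical): 30 samples, 4 features, 3 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
D = max_entropy_output(rng.normal(size=(3, 4)), X)   # true label distributions

targets = {"KL": kl_divergence, "squared Euclidean": squared_euclidean}
predictions = {}
for name, target in targets.items():
    theta = fit_ldl(X, D, target)
    predictions[name] = max_entropy_output(theta, X)
    print(name, "-> training KL:", kl_divergence(D, predictions[name]))
```

Because the target function is passed in as an argument, swapping distances changes only one line, which mirrors the kind of comparison the abstract describes. The finite-difference gradient keeps the sketch short. An analytic gradient per target function would be the usual choice for larger problems.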