Journal of Frontiers of Computer Science and Technology, 2015, Vol. 9, Issue (1): 105-111. DOI: 10.3778/j.issn.1673-9418.1405035

Semi-Supervised Clustering Learning Combined with Feature Preferences

FANG Ling, CHEN Songcan+   

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  • Online: 2015-01-01  Published: 2014-12-31

Abstract: Semi-supervised clustering is an important research topic in machine learning. It guides clustering with prior knowledge: either the label information of a small amount of data at the pattern (sample) level, or feature preferences, that is, the relative importance relations between features, at the feature level. However, existing semi-supervised clustering algorithms consider prior knowledge from only one of these two levels, and few use both jointly. To fill this gap, this paper extends traditional semi-supervised clustering algorithms by jointly exploiting the given feature preferences at the feature level and the semi-supervised information of a small amount of labeled data at the pattern level. Preliminary experimental results show the effectiveness of the proposed algorithm.

Key words: semi-supervised learning, clustering, semi-supervised clustering, feature preferences, label information
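
Illustrative sketch (not the algorithm of the paper): the abstract does not give the objective function or the update rules, so the Python code below only shows one possible way to combine the two kinds of prior knowledge. It assumes a seeded k-means in which labeled points keep their labels, feature weights inversely proportional to within-cluster dispersion, and a simple averaging heuristic that repairs any weight pair violating a given preference. The function name `seeded_weighted_kmeans`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np

def seeded_weighted_kmeans(X, seed_labels, k, prefs, n_iter=20):
    """Hypothetical sketch: seeded k-means with preference-repaired feature weights.

    X           : (n, d) float data matrix
    seed_labels : (n,) ints, cluster id for labeled points, -1 for unlabeled
                  (assumption: every cluster has at least one labeled point)
    prefs       : list of (s, t) pairs, "feature s should weigh >= feature t"
    """
    n, d = X.shape
    w = np.full(d, 1.0 / d)                      # start from uniform feature weights
    labeled = seed_labels >= 0
    # Initialise each centroid from the labeled points of that cluster.
    centers = np.stack([X[seed_labels == c].mean(axis=0) for c in range(k)])
    assign = seed_labels.copy()
    for _ in range(n_iter):
        # Weighted squared Euclidean distance from every point to every centroid.
        dist = (((X[:, None, :] - centers[None, :, :]) ** 2) * w).sum(axis=2)
        assign = dist.argmin(axis=1)
        assign[labeled] = seed_labels[labeled]   # labeled points keep their labels
        for c in range(k):
            centers[c] = X[assign == c].mean(axis=0)
        # Features with small within-cluster dispersion receive larger weights.
        disp = ((X - centers[assign]) ** 2).sum(axis=0) + 1e-12
        w = (1.0 / disp) / (1.0 / disp).sum()
        # Heuristic repair (an assumption): average any pair violating a preference.
        for _ in range(len(prefs)):
            for s, t in prefs:
                if w[s] < w[t]:
                    w[s] = w[t] = 0.5 * (w[s] + w[t])
    return assign, centers, w

# Toy data: two groups separated along feature 0, one labeled point per group.
X = np.array([[0.0, 0.0], [0.1, 1.0], [0.2, -1.0],
              [5.0, 0.5], [5.1, -0.5], [4.9, 1.0]])
seed_labels = np.array([0, -1, -1, 1, -1, -1])
# Preference: feature 1 should be at least as important as feature 0.
assign, centers, w = seeded_weighted_kmeans(X, seed_labels, k=2, prefs=[(1, 0)])
```

In this sketch the labels act at the pattern level (they fix assignments and initialise centroids), while the preferences act at the feature level (they constrain the weights used in the distance). With several interacting preferences the averaging pass is not guaranteed to satisfy all of them; a constrained optimisation step would be needed for that.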
