Journal of Frontiers of Computer Science and Technology ›› 2021, Vol. 15 ›› Issue (12): 2438-2448. DOI: 10.3778/j.issn.1673-9418.2105116

• Theory and Algorithm •

Semi-supervised Clustering Method for Non-negative Functional Data

YAO Xiaohong, HUANG Hengjun   

  1. School of Statistics, Lanzhou University of Finance and Economics, Lanzhou 730020, China
  • Online: 2021-12-01 Published: 2021-12-10

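Non-negative Semi-supervised Functional Clustering Method

YAO Xiaohong, HUANG Hengjun

  1. School of Statistics, Lanzhou University of Finance and Economics, Lanzhou 730020, China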

Abstract:

Functional clustering analysis is an important tool for exploring functional data. Most existing functional clustering methods are essentially unsupervised and do not take the label information of the data into account. To address both the unsupervised nature of existing functional clustering methods and the non-negativity of functional data, a semi-supervised non-negative functional clustering method (SSNFC) is proposed for clustering non-negative functional data with a small amount of label information. First, the label information is integrated into functional clustering by introducing the constrained non-negative matrix factorization (CNMF) technique, and a one-step model is constructed that fuses curve fitting, the non-negativity constraint and functional clustering into a single objective function. Second, an iterative updating algorithm is derived, and its local convergence and time complexity are discussed. Finally, experimental results on simulated data, the Growth data and the TIMIT (Texas Instruments and Massachusetts Institute of Technology) speech data indicate that SSNFC helps improve clustering performance compared with unsupervised functional clustering methods.

Key words: functional data, clustering analysis, semi-supervised learning, constrained non-negative matrix factorization
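
The way CNMF carries label information can be illustrated with a small sketch. The code below is not the SSNFC model itself: it omits the curve-fitting term and assumes the curves have already been discretized into a non-negative matrix `X` (one column per curve). It uses the generic CNMF factorization X ≈ U (A Z)^T with standard multiplicative updates for the squared Frobenius error. The function names, the `-1` convention for unlabeled curves, and all parameter defaults are illustrative assumptions.

```python
import numpy as np

def label_constraint_matrix(labels, n_classes):
    """Build the CNMF constraint matrix A (hypothetical helper).

    labels[i] is a class index in [0, n_classes) for a labeled sample
    and -1 for an unlabeled one (a convention assumed here).
    Labeled samples of class c share column c; every unlabeled sample
    gets a column of its own.
    """
    labels = np.asarray(labels)
    labeled = np.flatnonzero(labels >= 0)
    unlabeled = np.flatnonzero(labels < 0)
    A = np.zeros((labels.size, n_classes + unlabeled.size))
    A[labeled, labels[labeled]] = 1.0
    A[unlabeled, n_classes + np.arange(unlabeled.size)] = 1.0
    return A

def cnmf(X, labels, n_classes, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate non-negative X (features x samples) as U @ (A @ Z).T."""
    rng = np.random.default_rng(seed)
    A = label_constraint_matrix(labels, n_classes)
    AtA = A.T @ A
    U = rng.random((X.shape[0], rank)) + eps
    Z = rng.random((A.shape[1], rank)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep U and Z non-negative.
        U *= (X @ A @ Z) / (U @ (Z.T @ AtA @ Z) + eps)
        Z *= (A.T @ X.T @ U) / (AtA @ Z @ (U.T @ U) + eps)
    V = A @ Z  # one row of coefficients per sample
    return U, V
```

Because `V = A @ Z`, curves that share a label are forced to have identical rows in `V`, which is how the labels constrain the low-dimensional representation. Cluster memberships could then be read off from `V`, for example by running k-means on its rows; that final step is likewise an assumption and not taken from the abstract.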

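Abstract (Chinese version):

Functional clustering analysis is an important tool for exploring functional data, yet most existing functional clustering methods are unsupervised and ignore the label information of the data. Considering the unsupervised nature of current functional clustering methods and the non-negativity that functional data commonly exhibit, a non-negative semi-supervised functional clustering method (SSNFC) is proposed for clustering non-negative functional data that carry a small amount of label information. First, by introducing the constrained non-negative matrix factorization (CNMF) technique, label information is incorporated into the functional clustering process, yielding a one-step model that unifies curve fitting, the non-negativity constraint and functional clustering. Second, an iterative updating algorithm for solving the model is given, its local convergence is proved, and its time complexity is analyzed. Finally, experimental results on randomly simulated data, the Growth data and the TIMIT speech data show that, compared with unsupervised functional clustering methods, the proposed SSNFC helps improve clustering performance.

Key words: functional data, clustering analysis, semi-supervised learning, constrained non-negative matrix factorization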