Journal of Frontiers of Computer Science and Technology ›› 2025, Vol. 19 ›› Issue (3): 582-601. DOI: 10.3778/j.issn.1673-9418.2405013

• Frontiers·Surveys •

Review of Multivariate Time Series Clustering Algorithms

ZHENG Desheng (郑德生), SUN Hanming (孙涵明), WANG Liyuan (王立远), DUAN Yaoxin (段垚鑫), LI Xiaoyu (李晓瑜)

  1. School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu 610500, China
  2. School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  3. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
  • Online: 2025-03-01 Published: 2025-02-28

Abstract: Multivariate time series (MTS) data record how multiple variables of a system change over time, and they serve as a key data foundation for intelligent technologies in many fields. Clustering, a core data mining tool, partitions data into clusters according to structural similarity; by identifying the structure and internal relationships of the data, it helps reveal how a system evolves and how its variables are correlated. The structural complexity of MTS data, the correlations among variables, and high dimensionality all pose challenges for cluster analysis, and a large body of research in China and abroad has addressed them. This paper therefore surveys clustering algorithms for MTS data. First, existing MTS clustering algorithms are compared and analyzed according to classification criteria such as feature extraction method, similarity measurement algorithm, and clustering partition framework. For each category of MTS clustering technique, a detailed summary covers the algorithm principles, representative methods, advantages and disadvantages, and the problems addressed. Commonly used evaluation criteria and publicly available datasets for MTS clustering are then discussed. Finally, starting from the particular structure of multivariate temporal data, the paper summarizes the open challenges of MTS clustering and outlines future research directions.

Key words: multivariate time series, clustering algorithm, feature representation, similarity measure, clustering evaluation index