Journal of Frontiers of Computer Science and Technology ›› 2025, Vol. 19 ›› Issue (7): 1931-1944. DOI: 10.3778/j.issn.1673-9418.2412086

• Artificial Intelligence · Pattern Recognition •

Spatio-Temporal Cross-Attention Feature Fusion Model for Traffic Flow Prediction

MENG Xiangfu, XU Yongjie, WENG Xue   

  1. School of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China
  • Online: 2025-07-01   Published: 2025-06-30

Abstract: Traffic flow prediction is a core component of intelligent transportation systems and is crucial for efficient traffic management and planning. To address the shortcomings of existing methods in modeling dynamic spatio-temporal dependencies and in feature representation, this paper proposes a traffic flow prediction model based on spatio-temporal cross-attention feature fusion. First, a dynamic multi-feature embedding module is constructed that integrates raw-data, periodic, spatial, and spatio-temporal adaptive embeddings to generate intrinsic spatio-temporal feature representations of traffic flow data, which improves the model's adaptability to diverse traffic patterns. Second, building on the Transformer encoder architecture, parallel temporal and spatial self-attention modules are designed to extract temporal and spatial features efficiently, laying a foundation for deep feature fusion. Finally, a spatio-temporal cross-attention feature fusion mechanism is introduced that applies multi-head cross-attention along the temporal and spatial dimensions separately. This allows temporal features to adaptively learn spatial information from key nodes, while spatial features selectively focus on important temporal information, so that temporal and spatial features are deeply fused and the dynamic spatio-temporal dependencies in traffic flow are captured more comprehensively. Experiments on four real-world traffic datasets show that, compared with the best baseline model, the proposed model reduces MAE, RMSE, and MAPE by 1.56%, 1.91%, and 2.58% on average, respectively.

Key words: traffic flow prediction, Transformer, cross-attention, feature embedding
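
The bidirectional cross-attention fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, omits the learned query/key/value projections, and uses a plain residual addition as the fusion step. The function names (`cross_attention`, `fuse`) and the toy inputs are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries: list of d-dimensional vectors from one feature stream.
    context: list of d-dimensional vectors from the other stream,
             used directly as both keys and values (no learned projections).
    Returns (outputs, weights), one output vector and one weight row per query.
    """
    d = len(queries[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every context vector.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in context]
        w = softmax(scores)
        # Weighted sum of the context vectors.
        out = [sum(w[j] * context[j][i] for j in range(len(context)))
               for i in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


def fuse(temporal, spatial):
    """Let each stream attend to the other, then add the result residually.

    The residual addition is an assumption made for this sketch.
    """
    t_ctx, _ = cross_attention(temporal, spatial)  # time queries, node keys/values
    s_ctx, _ = cross_attention(spatial, temporal)  # node queries, time keys/values
    fused_t = [[a + b for a, b in zip(x, c)] for x, c in zip(temporal, t_ctx)]
    fused_s = [[a + b for a, b in zip(x, c)] for x, c in zip(spatial, s_ctx)]
    return fused_t, fused_s


# Toy example: 3 time steps and 2 nodes, feature dimension d = 2.
temporal = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
spatial = [[0.5, 0.5], [2.0, 0.0]]
fused_t, fused_s = fuse(temporal, spatial)
```

Each fused stream keeps its original length (time steps or nodes) while carrying information selected from the other stream. A multi-head version, as named in the abstract, would run several such attentions on learned projections of the inputs and concatenate the results.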