Journal of Frontiers of Computer Science and Technology

• Academic Research •

GCN-Pointransformer model for solving the Traveling Salesman Problem

QIU Yunfei, LIU Yifei, YU Zhilong, JIN Haibo

  1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
  2. School of Business Administration, Liaoning Technical University, Huludao, Liaoning 125105, China

Abstract: Because the Transformer model relies on a fully connected attention mechanism, it suffers from high computational complexity and excessive GPU memory usage when solving the classic Traveling Salesman Problem (TSP). To address this problem, a GCN-Pointransformer model based on a graph convolutional embedding layer and a multi-head local self-attention mechanism is proposed. First, graph convolutional embedding is used to learn spatial features from the input data; the embedding layer contains multiple convolution kernels that extract local features of the input. Second, a multi-head local self-attention mechanism (MHLSA) removes redundant information and extracts useful features. In addition, a reversible residual network is used in the encoder, so that only the input and output embedding feature pairs need to be stored during backpropagation. Finally, a Pointer layer is added to the decoder, which uses the attention weights as a probability distribution to determine the next node to visit. In comparative experiments on random TSP datasets, the optimization gap is reduced by 12%, GPU memory usage by about 11%, and inference time by about 25%, showing that the proposed method outperforms the standard Transformer model for solving TSP.

Key words: TSP, GCN-Pointransformer, MHLSA, Reversible residual, Pointer layer
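
For readers who want a concrete picture of the pipeline the abstract describes, the following is a minimal, illustrative sketch in PyTorch, not the authors' implementation: a graph-convolution-style embedding over a k-nearest-neighbour city graph, multi-head self-attention restricted to that graph as a stand-in for MHLSA, and a pointer layer whose masked attention weights over unvisited cities give the distribution of the next node to visit. All helper and module names (knn_adjacency, GCNEmbedding, LocalSelfAttention, PointerDecoder) are assumptions made for illustration; the reversible residual encoder is sketched separately below.

```python
# Illustrative sketch (not the authors' code), assuming PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F


def knn_adjacency(coords: torch.Tensor, k: int) -> torch.Tensor:
    """Boolean (B, N, N) adjacency linking each city to its k nearest neighbours."""
    dist = torch.cdist(coords, coords)                       # pairwise distances
    idx = dist.topk(k + 1, largest=False).indices[..., 1:]   # drop the city itself
    adj = torch.zeros_like(dist).scatter_(-1, idx, 1.0).bool()
    return adj | adj.transpose(1, 2)                         # symmetrise


class GCNEmbedding(nn.Module):
    """Graph-convolution-style embedding: each city aggregates its neighbours' features."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(2, d_model)      # 2-D city coordinates -> d_model
        self.agg = nn.Linear(d_model, d_model)

    def forward(self, coords, adj):
        h = self.proj(coords)                                  # (B, N, D)
        a = adj.float()
        neigh = torch.bmm(a, h) / a.sum(-1, keepdim=True).clamp(min=1.0)
        return F.relu(h + self.agg(neigh))


class LocalSelfAttention(nn.Module):
    """Multi-head self-attention masked to the k-NN graph ("local" attention)."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, h, adj):
        eye = torch.eye(adj.size(1), dtype=torch.bool, device=adj.device)
        blocked = ~(adj | eye)                                 # True = may not attend
        mask = blocked.repeat_interleave(self.attn.num_heads, dim=0)
        out, _ = self.attn(h, h, h, attn_mask=mask)
        return h + out                                         # plain residual here


class PointerDecoder(nn.Module):
    """Pointer layer: attention scores over unvisited cities -> next-city distribution."""
    def __init__(self, d_model: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)

    def forward(self, h, visited):
        # Query from the mean of visited-city embeddings (a simple stand-in context).
        context = (h * visited.unsqueeze(-1)).sum(1) / visited.sum(1, keepdim=True)
        scores = torch.bmm(self.q(context).unsqueeze(1),
                           self.k(h).transpose(1, 2)).squeeze(1) / h.size(-1) ** 0.5
        scores = scores.masked_fill(visited.bool(), float("-inf"))  # no revisits
        return F.softmax(scores, dim=-1)                             # (B, N)


if __name__ == "__main__":
    B, N, D, K = 4, 20, 128, 8
    coords = torch.rand(B, N, 2)                 # random TSP instances in the unit square
    adj = knn_adjacency(coords, K)
    h = GCNEmbedding(D)(coords, adj)
    h = LocalSelfAttention(D, n_heads=8)(h, adj)
    visited = torch.zeros(B, N)
    visited[:, 0] = 1.0                          # tour starts at city 0
    probs = PointerDecoder(D)(h, visited)
    print(probs.shape, probs.sum(-1))            # torch.Size([4, 20]); each row sums to 1
```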
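
The abstract also states that the encoder uses a reversible residual network so that only input and output embedding pairs are stored during backpropagation. The RevNet-style coupling below is a minimal sketch of that general idea under the same assumptions (again, not the authors' code): the block's inputs can be recomputed exactly from its outputs, so intermediate activations need not be cached. The sub-layer names f_fn and g_fn are placeholders for, e.g., an attention layer and a feed-forward layer.

```python
# Illustrative sketch of a reversible residual (RevNet-style) coupling.
import torch
import torch.nn as nn


class ReversibleBlock(nn.Module):
    def __init__(self, f_fn: nn.Module, g_fn: nn.Module):
        super().__init__()
        self.f_fn, self.g_fn = f_fn, g_fn

    def forward(self, x1, x2):
        y1 = x1 + self.f_fn(x2)
        y2 = x2 + self.g_fn(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Recover the inputs exactly from the outputs -> no need to store them.
        x2 = y2 - self.g_fn(y1)
        x1 = y1 - self.f_fn(x2)
        return x1, x2


if __name__ == "__main__":
    d = 64
    block = ReversibleBlock(nn.Linear(d, d), nn.Linear(d, d)).eval()
    x1, x2 = torch.randn(8, d), torch.randn(8, d)
    with torch.no_grad():
        y1, y2 = block(x1, x2)
        r1, r2 = block.inverse(y1, y2)
    print(torch.allclose(x1, r1, atol=1e-6), torch.allclose(x2, r2, atol=1e-6))  # True True
```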