Genetic Algorithm Based on Multi-armed Bandit for Drone-Truck Collaborative Delivery

doi:10.3778/j.issn.1673-9418.2410063

Abstract

Abstract: Drone-truck collaborative delivery is changing traditional logistics with its high efficiency, environmental friendliness, and independence from terrain. The traveling salesman problem with drone (TSP-D) is a classic problem in this delivery mode. It is more complex than truck-only delivery, because the best delivery combination must be found from the collaborative interaction between the drone and the truck. A hybrid genetic algorithm based on multi-armed bandit (HGA-MAB) is proposed to solve the TSP-D. Chromosomes are encoded as natural-number permutations and decoded by an exact partitioning method based on dynamic programming, which generates a collaborative drone-truck delivery solution. A new multi-armed bandit local search strategy is designed, which treats the five search operators in the local search operator pool as the “arms” of the bandit. The reward of an arm is computed from the improvement in solution fitness after searching with that arm, and the probability of selecting each arm is then computed with the ε-greedy reinforcement learning method, so that a suitable operator is chosen to strengthen the local search ability of the algorithm. Experimental results show that, compared with other state-of-the-art algorithms, the proposed algorithm obtains lower solution costs on most test instances across different distributions and scales. Further experimental analysis indicates that the multi-armed bandit local search strategy adapts better than other local search strategies and significantly improves the performance of the algorithm. Finally, the proposed algorithm is applied to a real-world delivery case in Changsha to demonstrate its practical effect.

Key words: drone-truck collaborative delivery, traveling salesman problem with drone, hybrid genetic algorithm, multi-armed bandit
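
The operator selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `EpsilonGreedyOperatorSelector`, the value of `epsilon`, the running-mean reward estimate, and the clipping of negative improvements to zero are all assumptions made for illustration. Only the overall idea (five arms, reward from the improvement in solution fitness, ε-greedy selection probabilities) comes from the abstract.

```python
import random


class EpsilonGreedyOperatorSelector:
    """Hypothetical epsilon-greedy bandit over a pool of local search operators."""

    def __init__(self, n_arms, epsilon=0.1, rng=None):
        if n_arms <= 0:
            raise ValueError("n_arms must be positive")
        if not 0.0 <= epsilon <= 1.0:
            raise ValueError("epsilon must lie in [0, 1]")
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.counts = [0] * n_arms      # number of pulls per arm
        self.values = [0.0] * n_arms    # running mean reward per arm (assumption)
        self.rng = rng or random.Random()

    def probabilities(self):
        """Selection probability of each arm under epsilon-greedy."""
        # Ties are broken in favour of the lowest arm index.
        best = max(range(self.n_arms), key=lambda arm: self.values[arm])
        probs = [self.epsilon / self.n_arms] * self.n_arms
        probs[best] += 1.0 - self.epsilon
        return probs

    def select(self):
        """Sample one arm (operator index) from the current probabilities."""
        arms = range(self.n_arms)
        return self.rng.choices(arms, weights=self.probabilities(), k=1)[0]

    def update(self, arm, cost_before, cost_after):
        """Reward an arm by the cost reduction it achieved (assumed clipped at 0)."""
        reward = max(0.0, cost_before - cost_after)
        self.counts[arm] += 1
        # Incremental update of the mean reward of this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
        return reward


if __name__ == "__main__":
    # Five arms, matching the five operators mentioned in the abstract.
    selector = EpsilonGreedyOperatorSelector(n_arms=5, epsilon=0.2,
                                             rng=random.Random(0))
    arm = selector.select()
    # In a real search, cost_after would come from applying operator `arm`.
    selector.update(arm, cost_before=100.0, cost_after=92.5)
    print(selector.probabilities())
```

In a genetic algorithm loop, `select` would be called once per local search step to pick an operator, and `update` would be called with the solution cost before and after that operator is applied. A larger `epsilon` spends more steps exploring operators that currently look weak.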

ZHU Yena, LIU Min, ZHAO Yijiang, CHEN Xuanlin. Genetic Algorithm Based on Multi-armed Bandit for Drone-Truck Collaborative Delivery[J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(8): 2261-2272.
