计算机科学与探索 ›› 2019, Vol. 13 ›› Issue (7): 1239-1250.DOI: 10.3778/j.issn.1673-9418.1806012

• 理论与算法 • 上一篇    下一篇

动态学习机制的双种群蚁群算法

袁汪凰1+,游晓明1,刘  升2   

  1. 1.上海工程技术大学 电子电气工程学院,上海 201600
    2.上海工程技术大学 管理学院,上海 201600
  • 出版日期:2019-07-01 发布日期:2019-07-08

Dual-Population Ant Colony Algorithm on Dynamic Learning Mechanism

YUAN Wanghuang1+, YOU Xiaoming1, LIU Sheng2   

  1. 1.School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201600, China
    2.School of Management, Shanghai University of Engineering Science, Shanghai 201600, China
  • Online:2019-07-01 Published:2019-07-08

摘要: 针对蚁群算法易陷入局部最优与收敛速度较慢的不足,提出了动态学习机制的双种群蚁群算法。该算法重点引入奖惩模型,奖励算子提高算法的收敛速度,惩罚算子增加种群的多样性。由SA-MMAS(adaptive simulated annealing ant colony algorithm based on max-min ant system)和MMAS(max-min ant system)两个种群合作搜索路径,蚁群间根据不同城市规模动态地进行信息素交流,在种群交流后利用奖惩模型对双种群间的学习合作行为给予动态的反馈,从而平衡算法的多样性与收敛速度。通过17个经典旅行商问题(traveling salesman problem,TSP)实例进行验证,结果表明该算法能以较少的迭代次数取得最优解或接近最优解。对于中大规模的TSP问题效果更好,从而验证了算法的高效性和可行性。

关键词: 动态学习, 奖惩模型, 双种群, 旅行商问题

Abstract: Aiming at the deficiencies of ant colony algorithm that can easily fall into the local optimum and the convergence speed is slow, a dual population ant colony algorithm based on dynamic learning mechanism is proposed. This algorithm focuses on the reward penalty model. The reward operator improves the convergence speed of the algorithm, and the penalty operator improves the diversity of the algorithm. Two populations SA-MMAS (adaptive simulated annealing ant colony algorithm based on max-min ant system) and MMAS (max-min ant system) coopera-tively search paths, and then the ant colonies dynamically communicate pheromone according to different city sizes. After communication between the two colonies, the incentive penalty model is used to give dynamic feedback to the learning cooperative behavior between the two colonies, thus balancing the diversity and convergence speed of the algorithm. Verified by 17 classic TSP (traveling salesman problem) instances, the results show that the algorithm can obtain the optimal solution or near optimal solution with fewer iterations. It is more effective for medium and large-scale TSP, thus verifying the efficiency and feasibility of the algorithm.

Key words: dynamic learning, reward penalty model, dual population, traveling salesman problem (TSP)