Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (9): 2151-2162. DOI: 10.3778/j.issn.1673-9418.2102070

• Theory and Algorithm •

Particle Swarm Optimization Combined with Q-learning of Experience Sharing Strategy

LUO Yixuan1,2, LIU Jianhua1,2,+, HU Renyuan1,2, ZHANG Dongyang1,2, BU Guannan1,2

  1. College of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, China
  2. Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fuzhou 350118, China
  • Received: 2021-03-01 Revised: 2021-05-08 Published: 2022-09-01 Released: 2021-05-18
  • Corresponding author: + E-mail: jhliu@fjnu.edu.cn
  • About the authors: LUO Yixuan, born in 1996 in Fuzhou, Fujian, M.S. candidate, member of CCF. His research interests include computational intelligence and reinforcement learning.
    LIU Jianhua, born in 1967 in Ji'an, Jiangxi, Ph.D., professor, member of CCF. His research interests include computational intelligence, big data analysis and IoT technology.
    HU Renyuan, born in 1997, M.S. candidate, member of CCF. His research interests include natural language processing and deep learning.
    ZHANG Dongyang, born in 1994 in Huai'an, Jiangsu, M.S. candidate. Her research interests include computational intelligence and machine learning.
    BU Guannan, born in 1994 in Fuyang, Anhui, M.S. candidate, member of CCF. His research interests include computational intelligence and machine learning.
  • Supported by:
    Natural Science Foundation of Fujian Province (2019J01061137); Development Foundation for Fujian University of Technology (GY-Z17150)

Abstract:

Conventional particle swarm optimization (PSO) tends to fall into local optima, lacks population diversity, and converges with low precision. Improving PSO with Q-learning, a reinforcement learning method, has recently emerged as a new approach, but existing work of this kind chooses its parameters rather subjectively and relies on a single strategy, which leaves it unable to handle complex situations. This paper proposes a Q-learning PSO with experience sharing (QLPSOES). The algorithm combines PSO with Q-learning by building a Q-table for each particle, from which the particle dynamically selects its parameter settings. An experience sharing strategy is also designed, in which particles share the “behavior experience” of the optimal particle through the Q-table; this accelerates the convergence of the Q-tables, strengthens learning among particles, and balances the global and local search abilities of the algorithm. In addition, orthogonal experiments are used to find the optimal combination of the state, action parameter, and reward function settings for the Q-learning-based PSO. Experiments on benchmark functions from CEC2013 show that QLPSOES clearly improves on the compared algorithms in both convergence speed and convergence accuracy, which verifies that the algorithm performs well.

Key words: particle swarm optimization (PSO), reinforcement learning, experience sharing strategy, Q-table, orthogonal experiment
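The idea summarized in the abstract, one Q-table per particle that picks the PSO parameters, plus sharing of the best particle's Q-table, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the state definition (binned distance to the global best), the three `(w, c1, c2)` action triples, the ±1 reward, the blending rule controlled by `beta`, and the sphere objective, which stands in for the CEC2013 benchmarks.

```python
import numpy as np

# Hypothetical action set: each action is one (w, c1, c2) parameter triple.
ACTIONS = np.array([
    [0.9, 2.5, 0.5],   # exploration-leaning
    [0.7, 1.5, 1.5],   # balanced
    [0.4, 0.5, 2.5],   # exploitation-leaning
])
N_STATES = 4  # hypothetical number of distance-to-gbest bins


def sphere(x):
    """Simple stand-in objective (not a CEC2013 function): sum of squares."""
    return np.sum(x ** 2, axis=-1)


def state_of(pos, gbest, span):
    """Bin each particle's distance to gbest, normalized by the box diagonal."""
    d = np.linalg.norm(pos - gbest, axis=1) / span
    return np.minimum((d * N_STATES).astype(int), N_STATES - 1)


def qlpso_sketch(dim=5, n=20, iters=200, lo=-10.0, hi=10.0,
                 alpha=0.1, gamma=0.9, eps=0.1, beta=0.2, seed=0):
    rng = np.random.default_rng(seed)
    span = np.sqrt(dim) * (hi - lo)
    pos = rng.uniform(lo, hi, (n, dim))
    vel = np.zeros((n, dim))
    fit = sphere(pos)
    pbest, pbest_fit = pos.copy(), fit.copy()
    g = int(np.argmin(pbest_fit))
    q = np.zeros((n, N_STATES, len(ACTIONS)))  # one Q-table per particle
    idx = np.arange(n)
    history = [pbest_fit[g]]

    for _ in range(iters):
        gbest = pbest[g]
        s = state_of(pos, gbest, span)
        # Epsilon-greedy action choice from each particle's own Q-table.
        greedy = np.argmax(q[idx, s], axis=1)
        explore = rng.random(n) < eps
        a = np.where(explore, rng.integers(len(ACTIONS), size=n), greedy)
        w, c1, c2 = ACTIONS[a].T[:, :, None]  # three arrays of shape (n, 1)

        # Standard PSO velocity and position update with the chosen parameters.
        r1, r2 = rng.random((2, n, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        new_fit = sphere(pos)

        # Assumed reward: +1 if the particle improved its fitness, -1 otherwise.
        reward = np.where(new_fit < fit, 1.0, -1.0)
        fit = new_fit
        better = fit < pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        g = int(np.argmin(pbest_fit))

        # Tabular Q-learning update for the (state, action) each particle used.
        s2 = state_of(pos, pbest[g], span)
        target = reward + gamma * q[idx, s2].max(axis=1)
        q[idx, s, a] += alpha * (target - q[idx, s, a])

        # Assumed experience sharing: pull every Q-table toward the best particle's.
        q += beta * (q[g] - q)
        history.append(pbest_fit[g])

    return pbest[g], pbest_fit[g], history
```

Calling `qlpso_sketch()` returns the best position found, its fitness, and the best-fitness history per iteration. Setting `beta=0` disables the sharing step, which gives a simple way to compare the two variants on the same seed.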

CLC Number: