Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (4): 693-701. DOI: 10.3778/j.issn.1673-9418.1802002

• Theory and Algorithm •

Cognitive Radio Anti-Jamming Decision Algorithm Based on Improved Reinforcement Learning

ZHU Rui, MA Yongtao+, NAN Yafei, ZHANG Yunlei

  1. School of Microelectronics, Tianjin University, Tianjin 300072, China
  • Online: 2019-04-01 Published: 2019-04-10

Abstract: Cognitive users in cognitive radio systems are easily jammed. To address this problem, this paper studies the interaction between a cognitive user with frequency-hopping capability and a smart jammer with intelligent sensing capability. To make full use of the limited radio spectrum, channel selection and power allocation are considered jointly: a utility function is designed that takes the cognitive user's spectral energy efficiency as its reference criterion, and an improved reinforcement learning algorithm is integrated into the cognitive decision engine. The decision algorithm models the interaction between the cognitive environment and the decision engine as the interaction between environment and agent in reinforcement learning. It explores for the action with the maximum reward, feeds that reward back to the cognitive decision engine, and obtains an adaptive, optimized strategy over the course of the interaction. Simulation results show that the proposed algorithm converges faster than the traditional one and that the selected strategy effectively improves the cognitive user's performance against smart jammers, outperforming a random strategy by more than 50%.

Key words: cognitive radio, reinforcement learning, power allocation, channel selection, anti-jamming
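
As a rough illustration of the environment-agent modelling summarized in the abstract, the sketch below uses plain tabular Q-learning to choose a channel and a power level jointly. It is not the authors' improved algorithm, whose details are not given on this page. The jammer model (it jams whichever channel the user occupied in the previous slot), the utility formula (rate divided by consumed power), and every constant and function name are hypothetical assumptions chosen only to make the example runnable.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
N_CHANNELS = 4                    # assumed number of hopping channels
POWER_LEVELS = [1.0, 2.0, 4.0]    # assumed transmit power options
JAM_POWER = 8.0                   # assumed jammer power on the jammed channel
NOISE_POWER = 1.0
CIRCUIT_POWER = 1.0               # fixed overhead so efficiency stays bounded

# An action is a joint choice of (channel, power level index).
ACTIONS = [(ch, p) for ch in range(N_CHANNELS) for p in range(len(POWER_LEVELS))]


def utility(channel, power_idx, jammed_channel):
    """Toy spectral energy efficiency: achievable rate per unit of power."""
    power = POWER_LEVELS[power_idx]
    interference = JAM_POWER if channel == jammed_channel else 0.0
    rate = math.log2(1.0 + power / (NOISE_POWER + interference))
    return rate / (power + CIRCUIT_POWER)


def train(steps=20000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Standard epsilon-greedy tabular Q-learning.

    The state is the channel the user occupied in the previous slot, which
    under the assumed reactive jammer is the channel jammed in this slot.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_CHANNELS)]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(ACTIONS))          # explore
        else:
            action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        channel, power_idx = ACTIONS[action]
        reward = utility(channel, power_idx, jammed_channel=state)
        next_state = channel                              # jammer follows the user
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Best (channel, power index) per state according to the learned table."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda a: row[a])] for row in q]


if __name__ == "__main__":
    for state, (channel, power_idx) in enumerate(greedy_policy(train())):
        print(f"jammed channel {state}: hop to {channel}, power {POWER_LEVELS[power_idx]}")
```

Under these assumptions the learned policy should hop away from the jammed channel in every state. Comparing its average utility with that of uniformly random actions would mirror, in miniature, the comparison against a random strategy reported in the abstract.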