Citation: CHEN Man, LI Mao-jun, LI Yi-wei, LAI Zhi-qiang. Mobile robot navigation based on deep reinforcement learning and artificial potential field method[J]. Journal of Yunnan University: Natural Sciences Edition, 2021, 43(6): 1125-1133. DOI: 10.7540/j.ynu.20210091

Mobile robot navigation based on deep reinforcement learning and artificial potential field method

  • Abstract: Deep reinforcement learning algorithms for mobile robot navigation in public service settings suffer from difficult state-information interaction, an insufficient feedback mechanism, and redundant action exploration. To address these problems, we propose PARL, a potential-field-enhanced attention deep reinforcement learning algorithm. First, we use the artificial potential field method and an attention mechanism to design a potential field attention network. Then, we use artificial potential field theory to construct a new potential field reward function. Finally, we propose a reverse approximation model and combine it with the space division scheme of the potential field reward function to improve the action space. The experimental results show that a robot driven by the PARL algorithm learns autonomously more efficiently, with an average navigation success rate of 100% and an average safety rate of 98.2%. Compared with the SARL, CADRL, and ORCA algorithms, the average navigation time is shortened by 0.14~1.11 s, and the navigation actions are more robust.
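
The abstract does not give the form of the potential field reward function. Purely as an illustration of the classical artificial potential field idea it builds on, the sketch below computes the standard attractive and repulsive potentials and a hypothetical shaped reward, taken here as the decrease in total potential between two consecutive robot positions. The function names, the gains `k_att` and `k_rep`, the influence radius `d0`, and the reward definition itself are assumptions for illustration, not the PARL design.

```python
import math

def attractive_potential(pos, goal, k_att=1.0):
    """Classical quadratic attractive potential: grows with distance to the goal."""
    d = math.dist(pos, goal)
    return 0.5 * k_att * d ** 2

def repulsive_potential(pos, obstacle, k_rep=1.0, d0=2.0):
    """Classical repulsive potential: nonzero only within influence radius d0."""
    d = math.dist(pos, obstacle)
    if d >= d0:
        return 0.0
    d = max(d, 1e-6)  # guard against division by zero at the obstacle itself
    return 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2

def total_potential(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Sum of the attractive term and one repulsive term per obstacle."""
    u = attractive_potential(pos, goal, k_att)
    u += sum(repulsive_potential(pos, ob, k_rep, d0) for ob in obstacles)
    return u

def potential_reward(prev_pos, pos, goal, obstacles):
    """Hypothetical shaped reward (assumption): positive when a step lowers the potential."""
    return (total_potential(prev_pos, goal, obstacles)
            - total_potential(pos, goal, obstacles))
```

Under this assumed definition, a step that moves the robot toward the goal and away from obstacles yields a positive reward, and a step toward an obstacle inside the radius `d0` is penalized.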

     
