TY - GEN
T1 - Toward Obstacle Avoidance for Mobile Robots Using Deep Reinforcement Learning Algorithm
AU - Gao, Xiaoshan
AU - Yan, Liang
AU - Wang, Gang
AU - Wang, Tiantian
AU - Du, Nannan
AU - Gerada, Chris
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/8/1
Y1 - 2021/8/1
N2 - The state-of-the-art deep reinforcement learning algorithm, i.e., the deep deterministic policy gradient (DDPG), has achieved good performance in continuous control problems in robotics. However, the conventional experience replay mechanism of the DDPG algorithm stores the experience explored by the mobile robot in the buffer pool and trains the neural network through random sampling, without considering whether a transition is valuable, which can degrade the network performance. To overcome this limitation, a DDPG framework with separated experience replay is developed for mobile robot collision-free navigation in this study, replaying the valuable and the failed experience transitions separately. Additionally, an environment state vector including the mobile robot and obstacles is designed, and the reward function and action space are also designed. The simulation results show that the proposed model possesses the collision-free navigation capacity to deal with multiple obstacles.
AB - The state-of-the-art deep reinforcement learning algorithm, i.e., the deep deterministic policy gradient (DDPG), has achieved good performance in continuous control problems in robotics. However, the conventional experience replay mechanism of the DDPG algorithm stores the experience explored by the mobile robot in the buffer pool and trains the neural network through random sampling, without considering whether a transition is valuable, which can degrade the network performance. To overcome this limitation, a DDPG framework with separated experience replay is developed for mobile robot collision-free navigation in this study, replaying the valuable and the failed experience transitions separately. Additionally, an environment state vector including the mobile robot and obstacles is designed, and the reward function and action space are also designed. The simulation results show that the proposed model possesses the collision-free navigation capacity to deal with multiple obstacles.
KW - deep deterministic policy gradient
KW - mobile robot
KW - obstacle avoidance
UR - http://www.scopus.com/inward/record.url?scp=85115445208&partnerID=8YFLogxK
U2 - 10.1109/ICIEA51954.2021.9516114
DO - 10.1109/ICIEA51954.2021.9516114
M3 - Conference contribution
AN - SCOPUS:85115445208
T3 - Proceedings of the 16th IEEE Conference on Industrial Electronics and Applications, ICIEA 2021
SP - 2136
EP - 2139
BT - Proceedings of the 16th IEEE Conference on Industrial Electronics and Applications, ICIEA 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th IEEE Conference on Industrial Electronics and Applications, ICIEA 2021
Y2 - 1 August 2021 through 4 August 2021
ER -