Autopilot Strategy Based on Improved DDPG Algorithm

2019-01-5072

11/04/2019

Event
New Energy & Intelligent Connected Vehicle Technology Conference
Abstract
Deep Deterministic Policy Gradient (DDPG) is a deep reinforcement learning algorithm. Because it performs well in continuous motion control, the DDPG algorithm has been applied to the field of self-driving. To address the instability of the DDPG algorithm during training, as well as its low training efficiency and slow convergence rate, an improved DDPG algorithm based on segmented experience replay is presented. Building on the DDPG algorithm, segmented experience replay selects training experiences by importance according to the training progress, improving the training efficiency and the stability of the trained model. The algorithm was tested in TORCS, an open-source 3D car racing simulator. The simulation results demonstrate that training stability is significantly improved compared with the DDPG and DQN algorithms, and that the average return is about 46% higher than that of the DDPG algorithm and about 55% higher than that of the DQN algorithm.
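The abstract does not detail the segmentation scheme. As a rough, hypothetical sketch of the idea, the Python snippet below partitions a replay buffer into a high-importance segment and an ordinary segment and shifts the sampling ratio between them as training progresses; the class name SegmentedReplayBuffer, the reward-based importance rule, and the linear progress schedule are illustrative assumptions, not the authors' exact method.

    import random
    from collections import deque

    class SegmentedReplayBuffer:
        """Sketch of a replay buffer split into importance segments (illustrative only)."""

        def __init__(self, capacity=100_000, reward_threshold=0.0):
            self.high = deque(maxlen=capacity // 2)  # high-importance transitions
            self.low = deque(maxlen=capacity // 2)   # ordinary transitions
            self.reward_threshold = reward_threshold

        def add(self, state, action, reward, next_state, done):
            # Simple importance rule (an assumption): route transitions by reward.
            transition = (state, action, reward, next_state, done)
            if reward >= self.reward_threshold:
                self.high.append(transition)
            else:
                self.low.append(transition)

        def sample(self, batch_size, progress):
            # progress in [0, 1]: early in training draw mostly high-importance
            # samples to speed up learning, later mix in more ordinary samples
            # for stability. The batch may be smaller if the segments are sparse.
            high_ratio = max(0.2, 1.0 - progress)
            n_high = min(len(self.high), int(batch_size * high_ratio))
            n_low = min(len(self.low), batch_size - n_high)
            batch = random.sample(self.high, n_high) + random.sample(self.low, n_low)
            random.shuffle(batch)
            return batch

In this sketch the buffer favors high-importance transitions early on and gradually mixes in ordinary transitions, which mirrors the progress-dependent selection of training experiences described in the abstract.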
Details
DOI
https://doi.org/10.4271/2019-01-5072
Pages
6
Citation
Tian, Z., Zuo, X., and Li, X., "Autopilot Strategy Based on Improved DDPG Algorithm," SAE Technical Paper 2019-01-5072, 2019, https://doi.org/10.4271/2019-01-5072.
Additional Details
Publisher
SAE International
Published
Nov 4, 2019
Product Code
2019-01-5072
Content Type
Technical Paper
Language
English