Autopilot Strategy Based on Improved DDPG Algorithm

2019-01-5072

11/04/2019

Event
New Energy & Intelligent Connected Vehicle Technology Conference
Abstract
Deep Deterministic Policy Gradient (DDPG) is a deep reinforcement learning algorithm. Because it performs well in continuous motion control, the DDPG algorithm has been applied to the field of self-driving. To address the instability of the DDPG algorithm during training, as well as its low training efficiency and slow convergence rate, an improved DDPG algorithm based on segmented experience replay is presented. Building on the DDPG algorithm, segmented experience replay selects training experiences by importance according to the training progress, improving the training efficiency and the stability of the trained model. The algorithm was tested in TORCS, an open-source 3D car racing simulator. The simulation results demonstrate that training stability is significantly improved compared with the DDPG and DQN algorithms, and that the average return is about 46% higher than that of the DDPG algorithm and about 55% higher than that of the DQN algorithm.
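The abstract does not detail the segmentation scheme. As a rough, hypothetical sketch of the idea, the Python snippet below partitions a replay buffer into a high-importance segment and an ordinary segment and shifts the sampling ratio between them as training progresses; the class name SegmentedReplayBuffer, the reward-based importance rule, and the linear progress schedule are illustrative assumptions, not the authors' exact method.

    import random
    from collections import deque

    class SegmentedReplayBuffer:
        """Sketch of a replay buffer split into importance segments (illustrative only)."""

        def __init__(self, capacity=100_000, reward_threshold=0.0):
            self.high = deque(maxlen=capacity // 2)  # high-importance transitions
            self.low = deque(maxlen=capacity // 2)   # ordinary transitions
            self.reward_threshold = reward_threshold

        def add(self, state, action, reward, next_state, done):
            # Simple importance rule (an assumption): route transitions by reward.
            transition = (state, action, reward, next_state, done)
            if reward >= self.reward_threshold:
                self.high.append(transition)
            else:
                self.low.append(transition)

        def sample(self, batch_size, progress):
            # progress in [0, 1]: early in training draw mostly high-importance
            # samples to speed up learning, later mix in more ordinary samples
            # for stability. The batch may be smaller if the segments are sparse.
            high_ratio = max(0.2, 1.0 - progress)
            n_high = min(len(self.high), int(batch_size * high_ratio))
            n_low = min(len(self.low), batch_size - n_high)
            batch = random.sample(self.high, n_high) + random.sample(self.low, n_low)
            random.shuffle(batch)
            return batch

In this sketch the buffer favors high-importance transitions early on and gradually mixes in ordinary transitions, which mirrors the progress-dependent selection of training experiences described in the abstract.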
Details
DOI
https://doi.org/10.4271/2019-01-5072
Pages
6
Citation
Tian, Z., Zuo, X., and Li, X., "Autopilot Strategy Based on Improved DDPG Algorithm," SAE Technical Paper 2019-01-5072, 2019, https://doi.org/10.4271/2019-01-5072.
Additional Details
Publisher
SAE International
Published
Nov 4, 2019
Product Code
2019-01-5072
Content Type
Technical Paper
Language
English