A Reinforcement Learning Algorithm for Speed Optimization and Optimal Energy Management of Advanced Driver Assistance Systems and Connected Vehicles

Journal Article
02-14-03-0023
ISSN: 1946-391X, e-ISSN: 1946-3928
Published August 25, 2021 by SAE International in United States
Citation: Shim, Y. and Mollo, C., "A Reinforcement Learning Algorithm for Speed Optimization and Optimal Energy Management of Advanced Driver Assistance Systems and Connected Vehicles," SAE Int. J. Commer. Veh. 14(3):289-302, 2021, https://doi.org/10.4271/02-14-03-0023.
Language: English

Abstract:

This article describes the application of Reinforcement Learning (RL) with an embedded heuristic algorithm to a multi-objective hybrid vehicle optimization. A multi-objective optimization problem (MOP) is defined as the minimization of total energy consumption and trip time through optimal control of vehicle speed over a known route. First, a computationally efficient heuristic optimization algorithm is formulated to solve the MOP for multiple traffic scenarios. Then, RL is integrated off-line into the heuristic optimization process and used to solve the MOP. Finally, the online optimization capability of the machine learning algorithm is discussed, as well as its extension to the vehicle routing problem and the hybrid electric vehicle. The specific scenario investigated is one in which a generic vehicle begins a trip on a one-lane highway. The length of the highway and the number of vehicles and traffic signals on the road are generic as well. At each time step, the vehicle must choose its speed (or acceleration/deceleration level) so that the resulting sequence minimizes both total energy consumption and trip time by the time it reaches the end of the route. The vehicle, acting as the agent, is assumed to be capable of sensing and estimating the relative speed of and distance to other vehicles using advanced driver assistance systems (ADAS) sensors or vehicle-to-vehicle (V2V) communications. Vehicle-to-infrastructure (V2I) communication is assumed for all vehicles so that the locations of traffic signals and their signal timings are known in advance. The constraints in the optimization are the speed limits, no traffic signal violations, and no collisions with other vehicles. For the three traffic scenarios, the energy savings achieved by the heuristic algorithm are between 11% and 17% compared to the average consumption, in kilowatt-hours (kWh), of the design of experiments (DOE) type of traffic simulation.
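
To make the problem statement above concrete, the following Python sketch scores one candidate speed sequence against the two objectives (energy and trip time) and two of the stated constraints (speed limit and no traffic signal violation). It is an illustration only and is not the article's heuristic or RL algorithm. The point-mass vehicle model, the parameter values (`mass_kg`, `drag`, `rolling_n`), the fixed-cycle `Signal` model, and the weighted-sum scalarization in `weighted_cost` are all assumptions introduced here. The collision constraint is omitted for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class Signal:
    """Hypothetical fixed-cycle traffic signal (assumed known via V2I)."""
    position_m: float  # distance of the signal from the route start
    cycle_s: float     # length of one full signal cycle
    green_s: float     # green window at the start of each cycle


def is_green(signal: Signal, t: float) -> bool:
    """True if the signal is green at absolute time t (seconds)."""
    return (t % signal.cycle_s) < signal.green_s


def evaluate_profile(speeds, dt, route_len_m, speed_limit, signals,
                     mass_kg=1500.0, drag=0.4, rolling_n=150.0):
    """Return (energy_kwh, trip_time_s, feasible) for one speed sequence.

    speeds[k] is the speed (m/s) held during time step k of length dt.
    The vehicle model is an assumed point mass with lumped drag and
    rolling resistance, starting from rest, with no regenerative braking.
    """
    pos = t = energy_j = prev_v = 0.0
    for v in speeds:
        # Constraint: stay within the speed limit.
        if not 0.0 <= v <= speed_limit:
            return math.inf, math.inf, False
        # Tractive force = inertia + aerodynamic drag + rolling resistance.
        force = mass_kg * (v - prev_v) / dt + drag * v * v + rolling_n
        energy_j += max(force, 0.0) * v * dt
        new_pos = pos + v * dt
        # Constraint: a signal crossed during this step must be green
        # at the end of the step.
        for s in signals:
            if pos < s.position_m <= new_pos and not is_green(s, t + dt):
                return math.inf, math.inf, False
        pos, t, prev_v = new_pos, t + dt, v
        if pos >= route_len_m:
            return energy_j / 3.6e6, t, True  # joules -> kWh
    # The sequence ended before the vehicle reached the end of the route.
    return math.inf, math.inf, False


def weighted_cost(energy_kwh, trip_time_s, w_energy=0.5,
                  energy_ref=1.0, time_ref=60.0):
    """Assumed weighted-sum scalarization of the two objectives."""
    return (w_energy * energy_kwh / energy_ref
            + (1.0 - w_energy) * trip_time_s / time_ref)
```

An optimizer, heuristic or learned, would search over `speeds` for feasible sequences with low `weighted_cost`; varying `w_energy` between 0 and 1 trades trip time against energy consumption.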