Methodology for Developing a Learning-Based Energy Management Strategy for Hybrid Electric Vehicles via Soft Actor-Critic (SAC) Deep Reinforcement Learning Algorithm

Content: Optimizing the energy management strategy of a hybrid electric vehicle is crucial for minimizing fuel and electrical energy consumption while maintaining the energetic stability of the electrical system. Conventional heuristic, rule-based approaches typically rely on classical optimization techniques and manual calibration by experienced engineers. These methods often rest on simplifying assumptions, yield sub-optimal results, and are increasingly time-consuming given the growing complexity of modern hybrid powertrain architectures. This research proposes a novel methodology for developing a learning-based energy management strategy (EMS) via deep reinforcement learning (DRL), as a step toward highly automated, data-based, and optimization-based development approaches. The methodology uses the Soft Actor-Critic (SAC) algorithm, an off-policy actor-critic method, to train an agent on experience gathered by interacting with an environment. The environment consists of a backward-looking, quasi-static vehicle longitudinal dynamics simulation model of an exemplary P2 plug-in hybrid electric vehicle (PHEV) combined with a database of customer-representative driving profiles. The agent learns optimal control policies over defined states and actions by optimizing a multi-criteria reward function that balances fuel efficiency against energetic stability. The framework permits the definition of both non-predictive and predictive states. Additionally, a shield function is implemented to account for hard constraints, ensuring safe and stable operation. Variation calculations and sensitivity analyses regarding reward function shaping and hyperparameter tuning are conducted.
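As an illustration of the two ideas named above, a multi-criteria reward and a shield on the agent's action, the following is a minimal sketch and not the authors' formulation. The abstract does not specify the reward terms, the weights, the action definition, or the constraint thresholds, so everything here is assumed: the action is taken to be a normalized torque split in [-1, 1] (positive meaning electric discharge, negative meaning battery charging), energetic stability is approximated by a quadratic penalty on deviation from a target state of charge (SOC), and all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the abstract does not state the actual values."""
    fuel: float = 1.0  # weight on fuel mass flow
    soc: float = 0.5   # weight on squared SOC deviation


def reward(fuel_rate_gps, soc, soc_target, w=RewardWeights()):
    """Assumed multi-criteria reward: penalize fuel use and SOC drift."""
    return -(w.fuel * fuel_rate_gps + w.soc * (soc - soc_target) ** 2)


def shield(torque_split, soc, soc_min=0.15, soc_max=0.95):
    """Assumed shield: project the proposed action onto a safe set.

    torque_split > 0 discharges the battery, torque_split < 0 charges it.
    The SOC limits are illustrative placeholders.
    """
    u = min(1.0, max(-1.0, torque_split))  # keep the action in its valid range
    if soc <= soc_min:
        u = min(u, 0.0)  # battery nearly empty: forbid further discharge
    if soc >= soc_max:
        u = max(u, 0.0)  # battery nearly full: forbid further charging
    return u
```

In this sketch the shield overrides the agent only at the SOC limits, so `shield(0.8, soc=0.10)` returns `0.0` while `shield(0.8, soc=0.50)` leaves the action unchanged. Because the shield sits outside the learned policy, the hard constraints hold regardless of how the reward weights are tuned.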
The agent is trained in an offline simulation environment, and the learned policy of the trained deep neural network (DNN) is transferred into deterministic control maps that can be deployed on vehicle control units, ensuring interpretability, reproducibility, and compliance with certification requirements. Finally, exemplary simulation results of the DRL-EMS approach are presented and compared to a benchmark equivalent consumption minimization strategy (ECMS). In conclusion, the proposed methodology provides a generally applicable approach for developing learning-based energy management strategies that are close to optimal while reducing manual calibration effort.
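The transfer of a learned policy into a deterministic control map can be pictured as sampling the policy on a fixed grid of states and then interpolating in the resulting table at run time. The sketch below shows that idea under stated assumptions and is not the procedure used in the paper. The function `policy_mean` is a hypothetical stand-in for the trained actor's deterministic output, and the two state axes (SOC and power demand in kW), their ranges, and their resolution are illustrative choices.

```python
import math
from bisect import bisect_right


def policy_mean(soc, power_kw):
    """Hypothetical stand-in for the trained actor's deterministic action."""
    return math.tanh(2.0 * (soc - 0.5) + 0.01 * power_kw)


# Assumed map axes: SOC from 0.1 to 0.9, power demand from 0 to 100 kW.
SOC_AXIS = [0.1 + 0.1 * i for i in range(9)]
POWER_AXIS = [10.0 * j for j in range(11)]

# Offline step: evaluate the policy once at every grid node.
CONTROL_MAP = [[policy_mean(s, p) for p in POWER_AXIS] for s in SOC_AXIS]


def _bracket(axis, x):
    """Return the lower cell index and the fractional position within it."""
    x = min(max(x, axis[0]), axis[-1])  # clamp to the map range
    i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def lookup(soc, power_kw):
    """Online step: bilinear interpolation in the precomputed map."""
    i, ts = _bracket(SOC_AXIS, soc)
    j, tp = _bracket(POWER_AXIS, power_kw)
    low = (1 - ts) * CONTROL_MAP[i][j] + ts * CONTROL_MAP[i + 1][j]
    high = (1 - ts) * CONTROL_MAP[i][j + 1] + ts * CONTROL_MAP[i + 1][j + 1]
    return (1 - tp) * low + tp * high
```

At run time only `lookup` and the table are needed, so the output for a given state is fixed and can be inspected cell by cell. The map resolution trades memory against how closely the table follows the network between grid nodes.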

Meta Tags

Topics: Hybrid electric vehicles
Energy consumption
Neural networks
Energy management

Affiliated or Co-Author: Mercedes-Benz AG
Vienna University of Tech.

Details

DOI: https://doi.org/10.4271/2026-01-0774

Citation: Metzler, S., Winke, F., Jungen, M., Schmiedler, S., et al., "Methodology for Developing a Learning-Based Energy Management Strategy for Hybrid Electric Vehicles via Soft Actor-Critic (SAC) Deep Reinforcement Learning Algorithm," 2026 Stuttgart International Symposium, Stuttgart, Germany, July 8, 2026, https://doi.org/10.4271/2026-01-0774.

Additional Details

Publisher: SAE International

Published: Jul 01

Product Code: 2026-01-0774

Content Type: Technical Paper

Language: English