Methodology for Developing a Learning-Based Energy Management Strategy for Hybrid Electric Vehicles via Soft Actor-Critic (SAC) Deep Reinforcement Learning Algorithm
2026-01-0774
To be published on 06/01/2026
- Content
- The optimization of energy management strategies for hybrid electric vehicles is crucial for minimizing fuel and electrical energy consumption while maintaining the energetic stability of the electrical system. Conventional heuristic, rule-based approaches typically rely on classical optimization techniques and manual calibration by experienced engineers. These methods often rest on simplifying assumptions, yield sub-optimal results, and are becoming increasingly time-consuming as modern hybrid powertrain architectures grow more complex. This research proposes a novel methodology for developing a learning-based energy management strategy using deep reinforcement learning (DRL), supporting the transition toward highly automated, data-based, and optimization-based development approaches. The methodology uses the Soft Actor-Critic (SAC) algorithm, an off-policy actor-critic method, to train an agent from experience gathered by interacting with an environment. The environment consists of a backward-looking, quasi-static longitudinal vehicle dynamics simulation model of an exemplary P2 plug-in hybrid electric vehicle (PHEV) combined with a database of customer-representative driving profiles. The agent learns a control policy over defined states and actions by optimizing a multi-criteria reward function that balances fuel efficiency against energetic stability. The framework permits the definition of both non-predictive and predictive states. Additionally, a shield function enforces hard constraints to ensure safe and stable operation. Comprehensive variation calculations and sensitivity analyses of reward function shaping and hyperparameter tuning are conducted.
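The interplay of shield function and multi-criteria reward described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the action definition (`torque_split`), the state-of-charge limits, the target SOC, and the weights `w_fuel` and `w_soc` are all hypothetical values chosen for illustration, and "energetic stability" is approximated here as a quadratic penalty on deviation from a target SOC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShieldLimits:
    # Hypothetical battery limits; the abstract does not state the real ones.
    soc_min: float = 0.15
    soc_max: float = 0.95


def shield(torque_split: float, soc: float,
           limits: ShieldLimits = ShieldLimits()) -> float:
    """Project the agent's proposed action onto the admissible set.

    torque_split (assumed action definition) is the share of demanded torque
    delivered by the electric machine: 1.0 is pure electric driving, 0.0 is
    engine only, and negative values mean the engine also charges the battery.
    """
    u = max(-1.0, min(1.0, torque_split))  # actuator range
    if soc <= limits.soc_min:
        u = min(u, 0.0)  # battery depleted: forbid further discharging
    if soc >= limits.soc_max:
        u = max(u, 0.0)  # battery full: forbid further charging
    return u


def reward(fuel_rate_gps: float, soc: float, soc_target: float = 0.5,
           w_fuel: float = 1.0, w_soc: float = 50.0) -> float:
    """Assumed multi-criteria reward: penalise fuel use and SOC deviation."""
    return -(w_fuel * fuel_rate_gps + w_soc * (soc - soc_target) ** 2)
```

In this sketch the shield acts on the action before it reaches the vehicle model, so the hard constraints hold regardless of what the agent proposes, while the reward only shapes the trade-off between the two soft objectives.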
The agent is trained in an offline simulation environment, and the learned policy of the trained deep neural network (DNN) is transferred into deterministic control maps that can be deployed on vehicle control units, ensuring interpretability, reproducibility, and compliance with certification requirements. Finally, exemplary simulation results of the novel energy management strategy are presented and compared with a benchmark equivalent consumption minimization strategy (ECMS). In conclusion, the proposed methodology provides a generally applicable approach for developing learning-based energy management strategies that are close to optimal while reducing manual calibration effort.
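One way to picture the transfer of a trained policy into a deterministic control map is to evaluate the policy once on a fixed grid of states and then use table interpolation at run time. The sketch below assumes a two-dimensional state (SOC and power demand) and uses a hypothetical stand-in function `toy_policy` in place of the trained DNN; the actual state definition, grid resolution, and map format used by the authors are not given in the abstract.

```python
from bisect import bisect_right
from typing import Callable, List, Sequence, Tuple


def toy_policy(soc: float, power_kw: float) -> float:
    """Hypothetical stand-in for the deterministic output of a trained actor."""
    return soc - 0.01 * power_kw


def build_map(policy: Callable[[float, float], float],
              soc_axis: Sequence[float],
              power_axis: Sequence[float]) -> List[List[float]]:
    """Evaluate the policy once per grid point and store the actions."""
    return [[policy(soc, p) for p in power_axis] for soc in soc_axis]


def _bracket(axis: Sequence[float], x: float) -> Tuple[int, float]:
    """Return the lower cell index and the interpolation weight for x."""
    x = max(axis[0], min(axis[-1], x))  # clamp to the map range
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def lookup(table: List[List[float]], soc_axis: Sequence[float],
           power_axis: Sequence[float], soc: float, power_kw: float) -> float:
    """Bilinear interpolation in the stored map; no network is evaluated."""
    i, ti = _bracket(soc_axis, soc)
    j, tj = _bracket(power_axis, power_kw)
    low = table[i][j] * (1 - tj) + table[i][j + 1] * tj
    high = table[i + 1][j] * (1 - tj) + table[i + 1][j + 1] * tj
    return low * (1 - ti) + high * ti
```

Because the map is a fixed table, the same inputs always produce the same output and every entry can be inspected, which is the property the abstract associates with interpretability and reproducibility. The price is an interpolation error between grid points that depends on the chosen grid resolution.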
- Citation
- Metzler, S., Winke, F., Jungen, M., Schmiedler, S., et al., "Methodology for Developing a Learning-Based Energy Management Strategy for Hybrid Electric Vehicles via Soft Actor-Critic (SAC) Deep Reinforcement Learning Algorithm," 2026 Stuttgart International Symposium, Stuttgart, Germany, July 8, 2026.