This study presents a torque distribution strategy for dual-motor electric
vehicles based on a Deep Deterministic Policy Gradient (DDPG) reinforcement
learning algorithm that minimizes energy consumption. By using a simplified
architecture and replicable reward functions, the proposed agents rely
exclusively on standard CAN bus signals, commanded longitudinal force, and the
motors’ velocities, eliminating the need for specialized sensors or complex
plant models. Two reinforcement learning agents are trained, each with a
different reward function: one power-based and one state-of-charge-based. These agents are validated
through high-fidelity CarSim–Simulink co-simulations across soft, medium, and
severe acceleration scenarios, in which they outperform
traditional adaptive methods. In the most demanding scenario, a typical adaptive
strategy consumes 7.8% more power than the theoretical optimum and recovers 85%
of the energy that the optimum recovers during braking, whereas the proposed
reinforcement learning strategies consume 0.6% more power and recover 95%.
These results highlight a practical, reliable solution for
maximizing efficiency in dual-motor powertrains without significant
computational burden on existing electronic control units.
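
To make the two reward formulations concrete, the following is a minimal sketch of what a power-based and a state-of-charge-based reward could look like. It is not the reward used in this study: the function names, the constant motor efficiencies, the scaling factors, and the use of a total torque request in place of the commanded longitudinal force are all illustrative assumptions.

```python
# Illustrative sketch only; all names and constants are hypothetical.

def electrical_power(torque, speed, efficiency):
    """Electrical power in W: positive when drawn, negative when regenerated."""
    mechanical = torque * speed  # N*m * rad/s = W
    if mechanical >= 0:
        # Driving: the battery supplies more than the mechanical output.
        return mechanical / efficiency
    # Braking: the battery recovers less than the mechanical input.
    return mechanical * efficiency


def power_reward(split, total_torque, speed_front, speed_rear,
                 eff_front=0.90, eff_rear=0.94):
    """Power-based reward: negative total electrical power, in kW.

    split is the fraction of total_torque sent to the front motor (0 to 1).
    Constant efficiencies are a toy assumption; real motors have
    speed- and torque-dependent efficiency maps.
    """
    torque_front = split * total_torque
    torque_rear = (1.0 - split) * total_torque
    total = (electrical_power(torque_front, speed_front, eff_front)
             + electrical_power(torque_rear, speed_rear, eff_rear))
    return -total / 1000.0


def soc_reward(soc_previous, soc_current, scale=1.0e4):
    """SoC-based reward: scaled change in state of charge over one step."""
    return scale * (soc_current - soc_previous)
```

Under these assumptions, both rewards penalize energy drawn from the battery and reward energy returned to it; the first measures this through instantaneous electrical power, the second through the change in state of charge.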