
A Combined Markov Chain and Reinforcement Learning Approach for Powertrain-Specific Driving Cycle Generation

Journal Article
2020-01-2185
ISSN: 2641-9645, e-ISSN: 2641-9645
Published September 15, 2020 by SAE International in United States
Citation: Dietrich, M., Chen, X., and Sarkar, M., "A Combined Markov Chain and Reinforcement Learning Approach for Powertrain-Specific Driving Cycle Generation," SAE Int. J. Adv. & Curr. Prac. in Mobility 3(1):516-527, 2021, https://doi.org/10.4271/2020-01-2185.
Language: English

Abstract:

Driving cycles are valuable tools for emissions calibration at engine and powertrain test beds. While generic velocity profiles were sufficient in the past, legislative changes and the increasing complexity of powertrain and exhaust aftertreatment systems require a new approach: realistic, transient cycles - which include critical driving maneuvers and can be tailored to a specific powertrain configuration - are needed to optimize the emission behavior of that powertrain.
For the generation of realistic velocity profiles, the Markov chain approach has been widely used and described in the literature. So far, however, this approach has only been used to generate cycles that are statistically representative of a large database of real driving trips, which is typically not available during the early stages of development of a new powertrain.
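To make the Markov chain step concrete, the following is a minimal sketch of sampling a velocity profile from a transition probability matrix. The discretization into velocity states, the function name, and its arguments are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def generate_cycle(transition_matrix, velocity_bins, n_steps, rng=None):
    """Sample a velocity profile from a Markov chain.

    transition_matrix[i, j] is the probability of moving from velocity
    state i to state j in one time step; velocity_bins maps state
    indices to representative velocities (illustrative discretization).
    """
    rng = rng or np.random.default_rng()
    n_states = transition_matrix.shape[0]
    state = 0  # start the cycle from standstill
    profile = [velocity_bins[state]]
    for _ in range(n_steps - 1):
        # Draw the next state according to the current row of the matrix.
        state = rng.choice(n_states, p=transition_matrix[state])
        profile.append(velocity_bins[state])
    return np.array(profile)
```

With a transition matrix estimated from (or seeded for) a given powertrain project, repeated calls produce stochastically varying cycles that share the statistics encoded in the matrix.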
The work at hand combines a Markov chain approach for driving cycle generation with Q-learning - a reinforcement learning algorithm - to generate driving cycles that meet user-defined criteria and include critical driving maneuvers reflecting individual powertrain development targets.
In an iterative process, the generated driving cycles are evaluated against the defined criteria using simplified models of the powertrain, vehicle, and driver. The learning algorithm then adapts the Markov chain transition probability matrix to increase the occurrence of the desired critical driving maneuvers and to reduce the share of undesired stochastic elements.
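One way to picture this adaptation step is sketched below: the cycle score from the simplified models acts as the reward, a one-step Q-learning update is applied to the visited state transitions, and the transition probabilities are re-derived from the Q-values. The reward assignment, the update rule, and the softmax mapping are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def update_transition_matrix(Q, transitions, reward, transition_matrix,
                             alpha=0.1, gamma=0.9, temperature=1.0):
    """One illustrative learning pass over an evaluated cycle.

    `transitions` is the list of (state, next_state) pairs visited in the
    generated cycle and `reward` is the scalar score returned by the
    simplified powertrain/vehicle/driver evaluation.
    """
    for state, next_state in transitions:
        # Standard one-step Q-learning update for the visited transition.
        td_target = reward + gamma * Q[next_state].max()
        Q[state, next_state] += alpha * (td_target - Q[state, next_state])
    # Map Q-values back to a row-stochastic transition matrix (softmax),
    # so transitions associated with higher rewards become more likely.
    exp_q = np.exp(Q / temperature)
    transition_matrix[:] = exp_q / exp_q.sum(axis=1, keepdims=True)
    return Q, transition_matrix
```

Iterating generation, evaluation, and this kind of update shifts probability mass toward the desired critical maneuvers while keeping some stochastic variation in the resulting cycles.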
Exemplary results show that the proposed method is able to generate driving cycles that exhibit realistic properties and include the desired maneuvers with the expected statistical variance.