Studies on Drivers’ Driving Styles Based on Inverse Reinforcement Learning

Yuande Jiang; Weiwen Deng; Jinsong Wang; Bing Zhu

doi:10.4271/2018-01-0612

Although advanced driver assistance systems (ADAS) have been widely introduced in the automotive industry to enhance driving safety and comfort and to reduce the driver's burden, they generally do not reflect different drivers' driving styles, nor are they customized to individual personalities. Such personalization can be important for a comfortable and enjoyable driving experience and for improved market acceptance. However, understanding and identifying drivers' driving styles is challenging because of the size and diversity of the driving population. Previous research has mainly adopted physical approaches to modeling driving behavior, which are often severely limited in their ability to capture the characteristics of human drivers. This paper proposes a reinforcement learning based approach in which driving styles are formulated through drivers' learning processes from interaction with the surrounding environment. According to reinforcement learning theory, a driving action can be treated as maximizing a reward function. Instead of calibrating the unknown reward function to match the driver's desired response, we recover it from human driving data using maximum likelihood inverse reinforcement learning (MLIRL). An IRL-based longitudinal driving assistance system is also proposed. First, a large amount of real-world driving data is collected from a test vehicle and split into a training set and a testing set. Then, the longitudinal acceleration chosen by a human driver is modeled as following a Boltzmann distribution. The reward function is expressed as a linear combination of kernelized basis functions. The driving style parameter vector is estimated from the training set using MLIRL. Finally, a learning-based longitudinal driving assistance algorithm is developed and evaluated on the testing set.
The results demonstrate that the proposed method can satisfactorily reflect human drivers’ driving behavior.
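
The pipeline described above (Boltzmann action model, linear reward over kernelized basis functions, maximum likelihood estimation of the style parameters, evaluation on held-out data) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the two normalized state variables, the discretized acceleration grid, the Gaussian RBF kernel, the function names, and the synthetic data standing in for recorded driving. It also uses a one-step simplification in which the Boltzmann distribution is taken directly over the reward rather than over a long-horizon value function.

```python
import numpy as np

def rbf_features(states, actions, centers, width):
    """Gaussian RBF basis phi(s, a) for every (state, candidate action) pair.
    states: (N, d), actions: (K,), centers: (M, d + 1) -> features (N, K, M)."""
    n, k = len(states), len(actions)
    sa = np.concatenate(
        [np.repeat(states[:, None, :], k, axis=1),
         np.broadcast_to(actions[None, :, None], (n, k, 1))], axis=2)
    d2 = ((sa[:, :, None, :] - centers[None, None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def boltzmann_policy(theta, phi):
    """P(a | s) proportional to exp(theta . phi(s, a)); rows sum to 1."""
    logits = phi @ theta
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def log_likelihood(theta, phi, chosen):
    """Mean log-probability of the demonstrated action indices."""
    p = boltzmann_policy(theta, phi)
    return np.log(p[np.arange(len(chosen)), chosen]).mean()

def fit_theta(phi, chosen, lr=0.3, steps=1000):
    """Maximum likelihood estimate of theta by plain gradient ascent.
    Gradient = demonstrated features minus policy-expected features."""
    n, _, m = phi.shape
    theta = np.zeros(m)
    phi_chosen = phi[np.arange(n), chosen]
    for _ in range(steps):
        p = boltzmann_policy(theta, phi)
        expected = np.einsum("nk,nkm->nm", p, phi)
        theta += lr * (phi_chosen - expected).mean(axis=0)
    return theta

# Synthetic stand-in for driving data (assumption, not the paper's data set).
rng = np.random.default_rng(0)
states = rng.uniform(0.0, 1.0, size=(400, 2))    # hypothetical normalized state
actions = np.linspace(-1.0, 1.0, 7)              # normalized acceleration grid
centers = rng.uniform([0, 0, -1], [1, 1, 1], size=(12, 3))
phi = rbf_features(states, actions, centers, width=0.5)

true_theta = rng.normal(0.0, 3.0, size=12)       # hidden "driving style"
p_true = boltzmann_policy(true_theta, phi)
chosen = np.array([rng.choice(len(actions), p=row) for row in p_true])

train, test = slice(0, 300), slice(300, 400)     # training / testing split
theta_hat = fit_theta(phi[train], chosen[train])

ll_train_init = log_likelihood(np.zeros(12), phi[train], chosen[train])
ll_train_fit = log_likelihood(theta_hat, phi[train], chosen[train])
ll_test_fit = log_likelihood(theta_hat, phi[test], chosen[test])

# Assistance-style command: most probable acceleration under the learned style.
suggested = actions[boltzmann_policy(theta_hat, phi[test]).argmax(axis=1)]
print(ll_train_init, ll_train_fit, ll_test_fit)
```

With all-zero parameters the policy is uniform over the candidate accelerations, so the initial log-likelihood equals the negative logarithm of the number of candidates; because the objective is concave in the parameters, gradient ascent with a small step size improves on that baseline for the training set. Comparing the held-out log-likelihood against the same uniform baseline is one simple way to judge how well a fitted style generalizes.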