Computation of Driving Pleasure based on Driver's Learning Process Simulation by Reinforcement Learning
ISSN: 0148-7191, e-ISSN: 2688-3627
Published March 25, 2013 by SAE International in United States
To improve the driver's experience, including driving pleasure, it is important to evaluate the relationship between various vehicle characteristics and the driver's feeling. Although methods such as subjective sensory evaluation are commonly used, the mechanism underlying driving pleasure is not yet fully understood.
In this paper we introduce a novel method for evaluating driving pleasure based on numerical simulation of the driver's learning process. As an example of this method, we evaluate the relationship between the mechanical properties of the steering system and the pleasure felt during the driver's learning process.
One possible way to simulate the driver's learning process is machine learning. In particular, reinforcement learning has been studied as a model of how the human brain learns. We build a reinforcement learning driver model and a simple vehicle simulation model, and combine them into a human-vehicle model. The model is then simulated, with four different settings of steering stiffness, as it learns to drive on a winding road composed of two curves. The results show that the characteristics of the driver's learning process depend on the steering stiffness. We also find a trade-off between the learning speed at the beginning and the level attained at the end of the learning process. We therefore estimate that there is an optimal steering stiffness for continuous progress while learning to drive, with which the driver can feel a high sense of accomplishment.
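The learning setup described above can be sketched with a toy example: a tabular Q-learning "driver" tries to keep a point-mass "vehicle" near the centerline of a winding road, with a stiffness parameter scaling how strongly steering inputs move the vehicle. Everything here (the dynamics, the reward, the parameter values, and the function name `simulate_learning`) is an illustrative assumption, not the paper's actual human-vehicle model.

```python
import math
import random

def simulate_learning(stiffness, episodes=200, seed=0):
    """Toy Q-learning driver tracking a winding road's centerline.

    The per-episode return history serves as a crude "learning curve",
    analogous to the learning-process characteristics studied in the paper.
    All dynamics and constants are illustrative assumptions.
    """
    rng = random.Random(seed)
    actions = [-1.0, 0.0, 1.0]            # steer left / hold / steer right
    n_bins = 11                            # discretized lateral-error states
    q = [[0.0] * len(actions) for _ in range(n_bins)]
    alpha, gamma, eps = 0.2, 0.9, 0.1      # learning rate, discount, exploration

    def to_state(err):
        # Clamp the lateral error into one of n_bins discrete states.
        return min(n_bins - 1, max(0, int((err + 2.0) / 4.0 * n_bins)))

    returns = []
    for _ in range(episodes):
        y, total = 0.0, 0.0               # lateral position, episode return
        for t in range(100):
            target = math.sin(t / 10.0)    # winding road: alternating curves
            s = to_state(y - target)
            if rng.random() < eps:
                a = rng.randrange(len(actions))
            else:
                a = max(range(len(actions)), key=lambda i: q[s][i])
            # Steering stiffness scales how strongly the wheel input moves
            # the vehicle (a crude stand-in for the steering-system model).
            y += 0.3 * stiffness * actions[a] + 0.05 * (rng.random() - 0.5)
            next_target = math.sin((t + 1) / 10.0)
            r = -abs(y - next_target)      # reward: stay near the centerline
            s2 = to_state(y - next_target)
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            total += r
        returns.append(total)
    return returns
```

Running this for several stiffness values, e.g. `simulate_learning(stiffness=0.5)` versus `simulate_learning(stiffness=1.5)`, and comparing the early and late portions of the returned learning curves gives a rough feel for how a single mechanical parameter can reshape both initial learning speed and the final attained level.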
The aim of this research is to investigate whether the driver's learning progress can be simulated, so in this study we used simple vehicle and driver models. We will continue to develop more precise models of both the vehicle and the driver to uncover the mechanisms of driving pleasure.
Citation: Sakuma, T., Shimizu, T., Miki, Y., Doya, K. et al., "Computation of Driving Pleasure based on Driver's Learning Process Simulation by Reinforcement Learning," SAE Technical Paper 2013-01-0056, 2013, https://doi.org/10.4271/2013-01-0056.