Using Reinforcement Learning and Simulation to Develop Autonomous Vehicle Control Strategies
Technical Paper
2020-01-0737
ISSN: 0148-7191, e-ISSN: 2688-3627
Language: English
Abstract
While the use of machine learning in autonomous vehicle development has increased significantly in the past few years, reinforcement learning (RL) methods have only recently been applied. Convolutional Neural Networks (CNNs) became common for their powerful object detection and identification capabilities and have even provided end-to-end control of an autonomous vehicle. However, a CNN requires a large amount of labeled data to inform and train the neural network. While data is becoming more accessible, these networks remain sensitive to the data format and collection environment, which makes reusing others' data difficult. In contrast, RL develops solutions in a simulation environment through trial and error, without labeled data. Our research expands upon previous work in RL and Proximal Policy Optimization (PPO), and the application of these algorithms to 1/18th-scale cars, by extending this control strategy to a full-sized passenger vehicle. By using this method of unsupervised learning, our research demonstrates the ability to learn new control strategies in a simulated environment without the need for large amounts of real-world data. The use of a simulation environment for RL is important because the unsupervised learning methodology requires many trials to learn the desired behavior. Running these trials in the real world would be expensive and impractical; simulation instead enables solutions to be developed at low cost and in little time, as the process can be accelerated beyond real time. The simulation environment provides high-fidelity vehicle dynamics models as well as rendering capability for domain adaptation, enabling successful simulation-to-real-world transfer. Traditional control algorithms are applied to the learned strategies to ensure a proper mapping to the physical vehicle.
This approach results in a low cost, low data solution that enables control of a full-sized, self-driving passenger vehicle.
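The abstract names Proximal Policy Optimization as the learning algorithm. The sketch below illustrates the clipped surrogate objective at PPO's core, with a toy one-dimensional lane-keeping reward standing in for the paper's high-fidelity simulator; the environment, function names, and constants here are illustrative assumptions, not the paper's implementation.

```python
import math

def step(offset, action):
    """Toy 1-D lane-keeping dynamics (a hypothetical stand-in for the
    paper's simulator): the action nudges the lateral offset, and the
    reward penalizes distance from the lane center."""
    new_offset = offset + action
    return new_offset, -abs(new_offset)

def gaussian_logpdf(x, mean, std):
    """Log-density of a Gaussian policy's action distribution."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std * math.sqrt(2.0 * math.pi)))

def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO's per-sample clipped objective: cap how much credit the
    updated policy can claim for moving away from the policy that
    collected the trial data."""
    clipped_ratio = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped_ratio * advantage)

# Probability ratio between a slightly updated policy and the old one
# for the same logged steering action (numbers are illustrative).
action, old_mean, new_mean, std = 0.0, 0.0, 0.1, 0.5
ratio = math.exp(gaussian_logpdf(action, new_mean, std)
                 - gaussian_logpdf(action, old_mean, std))

# An advantageous action made 1.5x more likely is only credited up to
# the clip boundary 1 + eps = 1.2, which keeps each update small.
print(clipped_surrogate(1.5, 1.0))  # 1.2
```

This clipping is what lets PPO run many accelerated simulated trials per policy update without the policy drifting destructively far from the data it was trained on.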
Citation
Navarro, A., Genc, S., Rangarajan, P., Khalil, R. et al., "Using Reinforcement Learning and Simulation to Develop Autonomous Vehicle Control Strategies," SAE Technical Paper 2020-01-0737, 2020, https://doi.org/10.4271/2020-01-0737.