Using Reinforcement Learning and Simulation to Develop Autonomous Vehicle Control Strategies
Technical Paper
2020-01-0737
ISSN: 0148-7191, e-ISSN: 2688-3627
Language: English
Abstract
While the use of machine learning in autonomous vehicle development has increased significantly in the past few years, reinforcement learning (RL) methods have only recently been applied. Convolutional Neural Networks (CNNs) became common for their powerful object detection and identification capabilities and have even provided end-to-end control of an autonomous vehicle. However, a CNN requires a large amount of labeled data to inform and train the neural network. While data is becoming more accessible, these networks remain sensitive to the data format and collection environment, which makes reusing others' data difficult. In contrast, RL develops solutions in a simulation environment through trial and error, without labeled data. Our research expands upon previous work in RL and Proximal Policy Optimization (PPO), and the application of these algorithms to 1/18th-scale cars, by extending this control strategy to a full-sized passenger vehicle. By using this method of unsupervised learning, our research demonstrates the ability to learn new control strategies in a simulated environment without the need for large amounts of real-world data. The use of a simulation environment for RL is important because the unsupervised learning methodology requires many trials to learn the desired behavior. Running these trials in the real world would be expensive and impractical; simulation instead enables solutions to be developed at low cost and in little time, as the process can be accelerated beyond real time. The simulation environment provides high-fidelity vehicle dynamics models as well as rendering capability for domain adaptation, enabling successful simulation-to-real-world transfer. Traditional control algorithms are applied to the learned strategies to ensure a proper mapping to the physical vehicle.
This approach results in a low cost, low data solution that enables control of a full-sized, self-driving passenger vehicle.
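The abstract names Proximal Policy Optimization as the learning algorithm. The sketch below illustrates the clipped surrogate objective at PPO's core, with a toy one-dimensional lane-keeping reward standing in for the paper's high-fidelity simulator; the environment, function names, and constants here are illustrative assumptions, not the paper's implementation.

```python
import math

def step(offset, action):
    """Toy 1-D lane-keeping dynamics (a hypothetical stand-in for the
    paper's simulator): the action nudges the lateral offset, and the
    reward penalizes distance from the lane center."""
    new_offset = offset + action
    return new_offset, -abs(new_offset)

def gaussian_logpdf(x, mean, std):
    """Log-density of a Gaussian policy's action distribution."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std * math.sqrt(2.0 * math.pi)))

def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO's per-sample clipped objective: cap how much credit the
    updated policy can claim for moving away from the policy that
    collected the trial data."""
    clipped_ratio = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped_ratio * advantage)

# Probability ratio between a slightly updated policy and the old one
# for the same logged steering action (numbers are illustrative).
action, old_mean, new_mean, std = 0.0, 0.0, 0.1, 0.5
ratio = math.exp(gaussian_logpdf(action, new_mean, std)
                 - gaussian_logpdf(action, old_mean, std))

# An advantageous action made 1.5x more likely is only credited up to
# the clip boundary 1 + eps = 1.2, which keeps each update small.
print(clipped_surrogate(1.5, 1.0))  # 1.2
```

This clipping is what lets PPO run many accelerated simulated trials per policy update without the policy drifting destructively far from the data it was trained on.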
Citation
Navarro, A., Genc, S., Rangarajan, P., Khalil, R. et al., "Using Reinforcement Learning and Simulation to Develop Autonomous Vehicle Control Strategies," SAE Technical Paper 2020-01-0737, 2020, https://doi.org/10.4271/2020-01-0737.