LQR Gain Tuning for On-Demand Performance Using Reinforcement Learning

Authors

Jean-Paul Reddinger

Abstract

Content: A reinforcement learning agent is trained to optimally scale the weighting matrices of a linear quadratic regulator (LQR) for full state feedback control. Training uses the model-free Deep Deterministic Policy Gradient (DDPG) algorithm on a dynamic simulation of the linearized model of an AeroQuad Cyclone ARF quadrotor. In the first task, the agent is designed to produce an optimal control response to a step input between 0 and 20 meters. The optimality target is randomly varied between incentivizing low power and incentivizing rapid settling times, and the agent learns an optimal control policy from simulated experience. The second task applies the learning algorithm to energy budgeting. The mission consists of a randomly selected number of repetitive tasks and a randomly initialized amount of available on-board energy. The agent learns to budget energy consumption to maximize the likelihood that the entire mission succeeds. Conversely, when sufficient energy is available, it allows greater control effort from the controller.
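The trade-off the abstract describes, low power versus rapid settling, can be illustrated with a minimal sketch of how scaling the LQR weights changes the step response. Everything specific here is an assumption made for illustration and is not taken from the paper: the plant is a discrete-time double integrator in altitude rather than the linearized AeroQuad model, the weights are Q = q_scale * I and R = r_scale, the gain comes from a plain Riccati iteration, and the integral of squared control input stands in for power. The names lqr_gain and step_response are hypothetical. In the paper's setup, a trained agent would choose the scale factors; here they are set by hand.

```python
def lqr_gain(dt, q_scale, r_scale, iters=2000):
    """Discrete-time LQR gain for a double integrator via Riccati iteration.

    Assumed toy model (not the paper's quadrotor model):
    state = [altitude error, vertical velocity], input = acceleration.
    """
    a = [[1.0, dt], [0.0, 1.0]]
    b = [0.5 * dt * dt, dt]
    q = [[q_scale, 0.0], [0.0, q_scale]]  # Q = q_scale * I (assumption)
    p = [row[:] for row in q]             # start the iteration at P = Q
    k = [0.0, 0.0]
    for _ in range(iters):
        # P A (2x2) and P B (2x1)
        pa = [[sum(p[i][m] * a[m][j] for m in range(2)) for j in range(2)]
              for i in range(2)]
        pb = [sum(p[i][m] * b[m] for m in range(2)) for i in range(2)]
        # Scalar S = R + B' P B, then K = S^-1 B' P A
        s = r_scale + sum(b[i] * pb[i] for i in range(2))
        k = [sum(b[i] * pa[i][j] for i in range(2)) / s for j in range(2)]
        # P <- Q + A' P A - (A' P B) K
        atpa = [[sum(a[m][i] * pa[m][j] for m in range(2)) for j in range(2)]
                for i in range(2)]
        atpb = [sum(a[m][i] * pb[m] for m in range(2)) for i in range(2)]
        p = [[q[i][j] + atpa[i][j] - atpb[i] * k[j] for j in range(2)]
             for i in range(2)]
    return k


def step_response(k, step_m=10.0, dt=0.05, n_steps=1200):
    """Simulate u = -K x and return (2% settling time, control effort)."""
    pos_err, vel = -step_m, 0.0
    effort, settle_time = 0.0, 0.0
    for i in range(n_steps):
        u = -(k[0] * pos_err + k[1] * vel)
        effort += u * u * dt  # crude proxy for energy use (assumption)
        pos_err, vel = pos_err + vel * dt + 0.5 * u * dt * dt, vel + u * dt
        if abs(pos_err) > 0.02 * step_m:
            settle_time = (i + 1) * dt  # last time outside the 2% band
    return settle_time, effort


if __name__ == "__main__":
    for r_scale in (0.1, 10.0):  # cheap control vs. expensive control
        gain = lqr_gain(0.05, 1.0, r_scale)
        t_settle, effort = step_response(gain)
        print(f"R scale {r_scale}: settles in {t_settle:.2f} s, effort {effort:.1f}")
```

Lowering the control weight produces larger gains, a faster response, and more control effort, while raising it does the opposite. A learned policy of the kind the abstract describes would pick a point along this trade-off according to the current optimality target or the remaining energy budget.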

Meta Tags

Topics: Machine learning
Optimization
Energy consumption
Measurements
Simulation and modeling
Weather and climate

Details

Citation: Reddinger, J., "LQR Gain Tuning for On-Demand Performance Using Reinforcement Learning," Vertical Flight Society 74th Annual Forum and Technology Display, Phoenix, Arizona, May 14, 2018, https://doi.org/10.4050/F-0074-2018-12916.
