LQR Gain Tuning for On-Demand Performance Using Reinforcement Learning
F-0074-2018-12916
5/14/2018
- Content
A reinforcement learning agent is trained to optimally scale the weighting matrices of a linear quadratic regulator (LQR) for full state feedback control. Training uses the model-free Deep Deterministic Policy Gradient (DDPG) algorithm on a dynamic simulation of the linearized model of an AeroQuad Cyclone ARF quadrotor. In the first task, the agent is designed to produce an optimal control response to a step input between 0 and 20 meters. The optimality target is randomly varied between incentivizing low power and incentivizing rapid settling times, and the agent learns an optimal control policy from simulated experience. The second task applies the learning algorithm to energy budgeting. The mission consists of a randomly selected number of repetitive tasks and a randomly initialized amount of available on-board energy. The agent learns to budget energy consumption to maximize the likelihood that the entire mission succeeds. Conversely, when sufficient energy is available, the agent allows the controller greater control effort.
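The trade-off described above, where scaling the LQR weighting matrices shifts the response between low power and fast settling, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the plant is a unit-mass double integrator for altitude rather than the paper's linearized AeroQuad model, the nominal weights `Q0` and `R0` and the scale factors `q_scale` and `r_scale` are hypothetical, and the scales are fixed by hand rather than chosen by a DDPG agent.

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iters=5000, tol=1e-9):
    """Discrete-time LQR gain from fixed-point iteration of the Riccati recursion."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def step_response(K, A, B, z_ref, dt, steps):
    """Simulate an altitude step; return (2% settling time, integral of u^2)."""
    x = np.zeros((2, 1))                     # state: [altitude, vertical speed]
    target = np.array([[z_ref], [0.0]])
    effort = 0.0
    settle_time = 0.0
    for k in range(steps):
        u = -K @ (x - target)                # full state feedback on the error
        effort += float(u[0, 0]) ** 2 * dt   # crude proxy for energy use
        x = A @ x + B @ u
        if abs(x[0, 0] - z_ref) > 0.02 * abs(z_ref):
            settle_time = (k + 1) * dt       # last time outside the 2% band
    return settle_time, effort

# Assumed plant: unit-mass double integrator sampled at dt (not the AeroQuad model).
dt = 0.05
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt ** 2], [dt]])

# Hypothetical nominal weights; the scale factors stand in for the agent's action.
Q0 = np.diag([1.0, 0.1])
R0 = np.array([[1.0]])

def scaled_gain(q_scale, r_scale):
    return dlqr_gain(A, B, q_scale * Q0, r_scale * R0)

K_fast = scaled_gain(1.0, 0.1)    # cheap control: aggressive response
K_slow = scaled_gain(1.0, 10.0)   # expensive control: conservative response

settle_fast, effort_fast = step_response(K_fast, A, B, z_ref=10.0, dt=dt, steps=600)
settle_slow, effort_slow = step_response(K_slow, A, B, z_ref=10.0, dt=dt, steps=600)
print(f"fast: settle={settle_fast:.2f} s, effort={effort_fast:.2f}")
print(f"slow: settle={settle_slow:.2f} s, effort={effort_slow:.2f}")
```

In this sketch a larger `r_scale` penalizes control effort more heavily, so the step settles more slowly but spends less effort. A learned policy would pick the scales from the current optimality target or remaining energy instead of using fixed values.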
- Citation
- Reddinger, J., "LQR Gain Tuning for On-Demand Performance Using Reinforcement Learning," Vertical Flight Society 74th Annual Forum and Technology Display, Phoenix, Arizona, May 14, 2018, https://doi.org/10.4050/F-0074-2018-12916.