Theodore J. Perkins and Andrew G. Barto
We propose a general approach to safe reinforcement learning control based on Lyapunov design methods. In our approach, a Lyapunov function, a special form of domain knowledge, is used to formulate the action choices available to a reinforcement learning agent. A learning agent choosing among these actions provably enjoys performance guarantees and satisfies safety constraints of various kinds. We demonstrate the general approach by applying it to several illustrative pendulum control problems.
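To make the idea of Lyapunov-constrained action choices concrete, the following is a minimal sketch and not the authors' implementation. It assumes a frictionless pendulum with unit mass, length, and gravity (theta = 0 hanging down, dynamics theta'' = -sin(theta) + torque), and it assumes the Lyapunov function V = (E - E_target)^2 / 2, where E is the mechanical energy and E_target is the energy of the upright rest state. The names `safe_actions`, `choose_action`, and `q_values` are hypothetical and introduced only for illustration.

```python
import math
import random

# Assumed model: unit mass, length, and gravity; theta = 0 is hanging down.
E_TARGET = 2.0  # energy of the upright rest state under these assumptions


def energy(theta, omega):
    """Mechanical energy of the assumed pendulum."""
    return 0.5 * omega ** 2 + (1.0 - math.cos(theta))


def lyapunov(theta, omega):
    """Assumed Lyapunov function: squared energy error (zero at E_TARGET)."""
    return 0.5 * (energy(theta, omega) - E_TARGET) ** 2


def lyapunov_rate(theta, omega, torque):
    """Time derivative of lyapunov() along the dynamics.

    For theta'' = -sin(theta) + torque, dE/dt = omega * torque,
    so dV/dt = (E - E_TARGET) * omega * torque.
    """
    return (energy(theta, omega) - E_TARGET) * omega * torque


def safe_actions(theta, omega, torques):
    """Keep only torques that do not increase the Lyapunov function."""
    return [u for u in torques if lyapunov_rate(theta, omega, u) <= 0.0]


def choose_action(theta, omega, torques, q_values, epsilon=0.1, rng=random):
    """Epsilon-greedy choice restricted to the Lyapunov-safe torques.

    `torques` should include 0.0 so the safe set is never empty
    (zero torque gives dV/dt = 0 in this model).
    """
    allowed = safe_actions(theta, omega, torques)
    if rng.random() < epsilon:
        return rng.choice(allowed)
    # Hypothetical value table keyed by torque; unseen torques default to 0.
    return max(allowed, key=lambda u: q_values.get(u, 0.0))
```

In this sketch the learner is free to explore and to rank actions however it likes, but it can only ever pick torques that keep the energy error from growing, which is the sense in which the Lyapunov function shapes the action set. The check uses the continuous-time derivative, so a discrete-time controller would need an additional argument (or a margin) to preserve the guarantee between time steps.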