Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes

Authors

Satinder P. Singh

Proceedings:

Machine Learning

Volume

Issue:

Proceedings of the AAAI Conference on Artificial Intelligence, 12

Track:

Reinforcement Learning

Downloads:

Download PDF

Abstract:

Reinforcement learning (RL) has become a central paradigm for solving learning-control problems in robotics and artificial intelligence. R L researchers have focussed almost exclusively on problems where the controller has to maximize the discounted sum of payoffs. However, as emphasized by Schwartz (1993), in many problems, e.g., those for which the optimal behavior is a limit cycle, it is more natural and computationally advantageous to formulate tasks so that the controller’s objective is to maximize the average payoff received per time step. In this paper I derive new average-payoff RL algorithms as stochastic approximation methods for solving the system of equations associated with the policy evaluation and optimal control questions in average-payoff RL tasks. These algorithms are analogous to the popular td and Q-learning algorithms already developed for the discounted-payoff case. One of the algorithms clerived here is a significant variation of Schwartz’s R-learning algorithm. Preliminary empirical results are presented to validate these new algorithms.

AAAI

Proceedings of the AAAI Conference on Artificial Intelligence, 12

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.