AAAI Publications, Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence

Integrating Representation Learning and Temporal Difference Learning: A Matrix Factorization Approach
Martha White

Last modified: 2014-06-18


Reinforcement learning is a general formalism for sequential decision-making, with recent algorithm development focusing on function approximation to handle large state spaces and high-dimensional, high-velocity (sensor) data. The success of function approximators, however, hinges on the quality of the data representation. In this work, we explore representation learning within least-squares temporal difference learning (LSTD), with a focus on making the assumptions on the representation explicit and making the learning problem amenable to principled optimization techniques. We reformulate LSTD as a least-squares loss plus a concave regularizer, facilitating the addition of a regularized matrix factorization objective to specify the desired class of representations. The resulting joint optimization over the representation and value function parameters enables us to take advantage of recent advances in unsupervised learning and presents a general yet simple formalism for learning representations in reinforcement learning.
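
For readers unfamiliar with LSTD, the following is a minimal sketch of the standard textbook LSTD(0) solution for a fixed feature representation. It is not the reformulation proposed in the paper. The function name `lstd`, the optional `ridge` term, and the two-state example are assumptions made here for illustration only.

```python
import numpy as np

def lstd(phi, phi_next, rewards, gamma, ridge=0.0):
    """Standard LSTD(0) for fixed features (illustrative sketch).

    Solves A w = b, where
        A = Phi^T (Phi - gamma * Phi_next)
        b = Phi^T r
    Each row of phi / phi_next holds the features of a state and of its
    observed successor. `ridge` is a hypothetical stabilizing term.
    """
    phi = np.asarray(phi, dtype=float)
    phi_next = np.asarray(phi_next, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    return np.linalg.solve(A + ridge * np.eye(A.shape[0]), b)

# Hypothetical two-state chain with one-hot features:
# state 0 -> state 1 (reward 0), state 1 -> state 1 (reward 1).
phi = [[1.0, 0.0], [0.0, 1.0]]
phi_next = [[0.0, 1.0], [0.0, 1.0]]
rewards = [0.0, 1.0]
w = lstd(phi, phi_next, rewards, gamma=0.5)
print(w)  # estimated value of each state under one-hot features
```

With one-hot features the weights coincide with the state values, so the result can be checked by hand against the Bellman equation: V(1) = 1 / (1 - 0.5) = 2 and V(0) = 0 + 0.5 * V(1) = 1.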


Reinforcement learning; representation learning; sparse learning
