Metrics for Finite Markov Decision Processes

Norm Ferns, Prakash Panangaden, and Doina Precup

The notion of equivalence for stochastic processes is problematic because it requires that the transition probabilities agree exactly. This is not a robust concept, especially since the numbers used in probabilistic models usually come from experimentation or are approximate estimates; what is needed is a quantitative notion of equivalence. In our work we provide such a notion via semimetrics: distance functions on the state space that assign distances quantifying “how equivalent” states are. These semimetrics could potentially be used as a new theoretical tool to analyze current state compression algorithms for MDPs, or in practice to guide state aggregation directly. The ultimate goal of this research is to efficiently compress and analyze continuous state space MDPs. Here we focus on finite MDPs, but note that most of our results should hold, with slight modifications, in the context of continuous state spaces.
