Robert Givan and Thomas Dean
We have recently developed a method for solving implicit (or factored) Markov decision processes (MDPs) with very large state spaces. This method reduces a large implicit MDP to a family of related smaller explicit MDPs, which we call an "approximate MDP". An approximate MDP defines a set of (exact) MDPs by specifying upper and lower bounds on the transition probabilities and rewards. We have also developed algorithms that operate on approximate MDPs to estimate value functions and find approximately optimal policies.
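To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of value-bound computation over an approximate MDP: transition probabilities are given as intervals [P_lo, P_hi], and a value-iteration loop maintains lower and upper bounds on the optimal value function by choosing, at each backup, the worst-case and best-case transition distributions consistent with the interval constraints. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def extreme_dist(lo, hi, values, maximize):
    """Pick a transition distribution within [lo, hi] (summing to 1)
    that extremizes expected value: start at the lower bounds, then
    pour the remaining probability mass onto states in value order."""
    p = lo.copy()
    slack = 1.0 - p.sum()          # mass left to distribute
    order = np.argsort(values)     # ascending value -> pessimistic
    if maximize:
        order = order[::-1]        # descending value -> optimistic
    for s in order:
        add = min(hi[s] - lo[s], slack)
        p[s] += add
        slack -= add
        if slack <= 0:
            break
    return p

def interval_value_iteration(P_lo, P_hi, R, gamma=0.9, iters=200):
    """Bound the optimal value function of every exact MDP whose
    transitions lie within [P_lo, P_hi].
    Shapes: P_lo[a, s, s'], P_hi[a, s, s'], R[s]."""
    n_actions, n_states, _ = P_lo.shape
    v_lo = np.zeros(n_states)
    v_hi = np.zeros(n_states)
    for _ in range(iters):
        q_lo = np.empty((n_actions, n_states))
        q_hi = np.empty((n_actions, n_states))
        for a in range(n_actions):
            for s in range(n_states):
                p_worst = extreme_dist(P_lo[a, s], P_hi[a, s], v_lo, False)
                p_best = extreme_dist(P_lo[a, s], P_hi[a, s], v_hi, True)
                q_lo[a, s] = R[s] + gamma * p_worst @ v_lo
                q_hi[a, s] = R[s] + gamma * p_best @ v_hi
        v_lo = q_lo.max(axis=0)
        v_hi = q_hi.max(axis=0)
    return v_lo, v_hi
```

For any exact MDP whose parameters fall within the given intervals, the optimal value at each state lies between the returned v_lo and v_hi, which is the sense in which an approximate MDP yields value-function estimates.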