Dmitri Dolgov, Edmund Durfee
A weakness of classical Markov decision processes (MDPs) is that they scale very poorly due to the flat state-space representation. Factored MDPs address this by exploiting problem structure, but in general their solutions do not retain the structure and compactness of the problem representation, forcing approximate solutions; approximate linear programming (ALP) has emerged as a very promising MDP-approximation technique. To date, however, ALP work has focused on approximating the primal LP, and no effort has been invested in approximating the dual LP, which serves as the basis for solving a wide range of constrained MDPs. Our analysis of the dual LP shows that a straightforward application of linear approximations is not as well-suited to the dual, because some of the required computations cannot be carried out efficiently. Nonetheless, we demonstrate that this can be resolved by a method that approximates both the primal and the dual optimization coordinates, resulting in an ALP that is well-suited for constrained problems. The method effectively approximates both the optimization coordinates and the feasible regions of the LPs, and thus also serves as a new approach to the widely discussed problem of dealing with exponentially many constraints, which plagues both the primal and the dual ALP formulations.
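To make the object of study concrete, the dual LP referred to above is the standard occupation-measure formulation of an MDP, which underlies constrained-MDP methods. The following is a minimal sketch for a tiny flat (unfactored) MDP; the two-state problem, its numbers, and the use of scipy.optimize.linprog are illustrative assumptions, not the paper's factored approximation method:

```python
import numpy as np
from scipy.optimize import linprog

# Tiny hypothetical MDP: 2 states, 2 actions, discount gamma.
S, A = 2, 2
gamma = 0.9
# P[s, a, s'] = transition probability; R[s, a] = reward; alpha = initial dist.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.9, 0.1]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
alpha = np.array([0.5, 0.5])

# Dual LP over occupation measures x[s, a]:
#   maximize   sum_{s,a} R[s,a] * x[s,a]
#   subject to sum_a x[s',a] - gamma * sum_{s,a} P[s,a,s'] * x[s,a] = alpha[s']
#              x >= 0
c = -R.flatten()  # linprog minimizes, so negate the rewards
A_eq = np.zeros((S, S * A))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = float(s == sp) - gamma * P[s, a, sp]

res = linprog(c, A_eq=A_eq, b_eq=alpha, bounds=[(0, None)] * (S * A))
x = res.x.reshape(S, A)
# An optimal deterministic policy acts where the occupation measure is positive.
policy = x.argmax(axis=1)
```

Summing the flow-conservation constraints over s' shows the occupation measure must total 1/(1 - gamma); additional linear constraints on x (e.g., bounds on expected costs) are what make this formulation the natural basis for constrained MDPs. For factored MDPs this LP has exponentially many variables and constraints, which is the obstacle the approximation in the paper targets.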
Subjects: 3.4 Probabilistic Reasoning; 15.4 Reactive Control
Submitted: Apr 5, 2005