Branislav Kveton, Milos Hauskrecht
Markov decision processes (MDPs) with discrete and continuous state and action components can be solved efficiently by hybrid approximate linear programming (HALP). The main idea of the approach is to approximate the optimal value function by a linear combination of basis functions and to optimize the weights of this combination by linear programming. In this paper, we extend the existing HALP paradigm beyond the mixture-of-beta transition model. As a consequence, we permit other transition functions, such as normal distributions, without approximating them. Moreover, we identify a large class of basis functions that match these transition models and yield an efficient solution to the expectation terms in HALP. Finally, we apply the generalized HALP framework to solve a rover planning problem, which involves continuous time and resource uncertainty.
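To illustrate what an efficiently solvable expectation term can look like, the following minimal sketch assumes a one-dimensional normal transition density and a Gaussian-shaped basis function. Both choices, and all function names and parameter values, are hypothetical illustrations rather than the specific model or basis class used in the paper. Under these assumptions the expectation of the basis function over the next state has a closed form, which the sketch checks against a plain trapezoidal quadrature.

```python
import math


def gaussian_basis(x, center, width):
    """Unnormalized Gaussian-shaped basis function (assumed for illustration)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def normal_pdf(x, mean, std):
    """Density of a normal distribution, used here as the transition model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def expected_basis_closed_form(mean, std, center, width):
    """E[f(x')] for x' ~ N(mean, std^2) and f the Gaussian basis above.

    The product of two Gaussians integrates to a Gaussian in (mean - center)
    with variance std^2 + width^2, scaled by width / sqrt(std^2 + width^2).
    """
    var_sum = std ** 2 + width ** 2
    scale = width / math.sqrt(var_sum)
    return scale * math.exp(-((mean - center) ** 2) / (2.0 * var_sum))


def expected_basis_numeric(mean, std, center, width, n=20001, span=10.0):
    """Same expectation by trapezoidal quadrature over mean +/- span * std."""
    lo = mean - span * std
    hi = mean + span * std
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * normal_pdf(x, mean, std) * gaussian_basis(x, center, width)
    return total * h


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for the comparison.
    args = (0.3, 0.2, 0.5, 0.4)
    print(expected_basis_closed_form(*args), expected_basis_numeric(*args))
```

The closed form costs a handful of arithmetic operations per expectation term, whereas the quadrature grows with the grid size and with the dimension of the state space, which is why a basis family that pairs with the transition model in this way is attractive.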