Håkan L. S. Younes
We consider a special type of continuous-time Markov decision processes (MDPs) that arise when phase-type distributions are used to model the timing of non-Markovian events and actions. We focus, primarily, on the execution of phase-dependent policies. Phases are introduced into a model to represent relevant execution history, but there is no physical manifestation of phases in the real world. We treat phases as partially observable state features and show how a belief distribution over phase configurations can be derived from observable state features through the use of transient analysis for Markov chains. This results in an efficient method for phase tracking during execution that can be combined with the QMDP value method for POMDPs to make action choices. We also discuss, briefly, how the structure of MDPs with phase transitions can be exploited in structured value iteration with symbolic representation of vectors and matrices.
Subjects: 1.11 Planning; 3.4 Probabilistic Reasoning
Submitted: May 2, 2005