AAAI Publications, Twenty-Third International FLAIRS Conference

On the Episode Duration Distribution in Fixed-Policy Markov Decision Processes
Itamar Arel, Andrew S. Davis

Last modified: 2010-05-06


This paper presents a formalism for determining the episode duration distribution in fixed-policy Markov decision processes (MDPs). To achieve this goal, we borrow the notion of the n-th step first-visit probability from queuing theory, apply it to the Markov chain induced by the MDP under its fixed policy, and arrive at the distribution of episode durations between any two arbitrary states. We illustrate the proposed methodology with an agent navigating a 25-state maze, demonstrating the applicability of the method.
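The first-visit (first-passage) probability idea the abstract refers to can be sketched with standard Markov-chain machinery: zero out the column of the target state to forbid re-entry, then propagate the start distribution through this "taboo" matrix. This is a minimal illustration of the general technique, not the paper's exact formalism; the transition matrix below is an invented toy stand-in for the chain induced by a fixed policy.

```python
import numpy as np

def first_passage_distribution(P, i, j, n_max):
    """Distribution of the first-visit time from state i to state j
    in a Markov chain with row-stochastic transition matrix P.
    Returns f where f[n-1] = P(first visit to j occurs at step n)."""
    P = np.asarray(P, dtype=float)
    Q = P.copy()
    Q[:, j] = 0.0                 # taboo matrix: paths may not enter j
    v = np.zeros(P.shape[0])
    v[i] = 1.0                    # start with all mass in state i
    f = np.empty(n_max)
    for n in range(n_max):
        f[n] = v @ P[:, j]        # probability of hitting j at step n+1
        v = v @ Q                 # propagate mass while avoiding j
    return f

# Toy 3-state chain (hypothetical numbers, for illustration only)
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])
f = first_passage_distribution(P, 0, 2, 200)
print(f[:3], f.sum())   # total mass approaches 1 for this ergodic chain
```

The episode-duration distribution between two states is then read off directly from `f`; truncating at `n_max` is safe here because the residual mass decays geometrically.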


Keywords: episode duration distribution; Markov decision processes
