Patrick Riley and Manuela Veloso
An advising agent, a coach, provides advice to other agents about how to act. In this paper we contribute an advice generation method using observations of agents acting in an environment. Given an abstract state definition and partially specified abstract actions, the algorithm extracts a Markov Chain, infers a Markov Decision Process, and then solves the MDP (given an arbitrary reward signal) to generate advice. We evaluate our work in a simulated robot soccer environment and experimental results show improved agent performance when using the advice generated from the MDP for both a sub-task and the full soccer game.