Douglas J. Pearson
Autonomous agents functioning in complex and rapidly changing environments can improve their task performance if they update and correct their world model over the life of the agent. Existing research on this problem can be divided into two classes. First, reinforcement learners that use weak inductive methods to directly modify an agent’s procedural execution knowledge. These systems are robust in dynamic and complex environments but generally do not support planning or the pursuit of multiple goals and learn slowly as a result of their weak methods. In contrast, the second category, theory revision systems, learn declarative planning knowledge through stronger methods that use explicit reasoning to identify and correct errors in the agent’s domain knowledge. However, these methods are generally only applicable to agents with instantaneous actions in fully sensed domains.