Maria Cutumisu, Duane Szafron, Michael Bowling, Richard S. Sutton
We introduce the ALeRT (Action-dependent Learning Rates with Trends) algorithm, which makes two modifications to the learning rate and one to the exploration rate of traditional reinforcement learning techniques. First, the learning rates are action-dependent; second, they increase or decrease based on trends in reward sequences. The exploration rate decreases when the agent is learning successfully and increases otherwise. These changes result in faster learning. We implemented the algorithm in NWScript, a scripting language used by BioWare Corp.'s Neverwinter Nights game, to improve the behaviours of game agents so that they react more intelligently to game events. Our goal is to give an agent the ability to (1) discover favourable policies in a multi-agent computer role-playing game situation and (2) adapt to sudden changes in the environment.
Subjects: 12.1 Reinforcement Learning; 6.1 Life-Like Characters
Submitted: Aug 9, 2008
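
The abstract names three ideas (per-action learning rates, rate changes driven by reward trends, and an exploration rate tied to learning success) without giving update rules. The Python sketch below is only a toy illustration of those ideas under loudly stated assumptions; it is not the ALeRT algorithm as published and not the NWScript implementation. The class name `TrendAdaptiveAgent`, the bandit-style value update, the definition of a "trend" as two consecutive reward changes with the same sign, the definition of "learning successfully" as a reward at or above the running average, and all numeric constants are hypothetical choices made for this example.

```python
import random


class TrendAdaptiveAgent:
    """Toy bandit-style learner; every rule and constant here is an illustrative assumption."""

    def __init__(self, n_actions, alpha=0.1, epsilon=0.2, seed=0):
        self.value = [0.0] * n_actions         # estimated reward per action
        self.alpha = [alpha] * n_actions       # one learning rate per action
        self.prev_reward = [None] * n_actions  # last reward seen per action
        self.prev_change = [0.0] * n_actions   # last reward change seen per action
        self.epsilon = epsilon                 # exploration rate
        self.avg_reward = 0.0                  # running mean of all rewards
        self.steps = 0
        self.rng = random.Random(seed)

    def select_action(self):
        # Epsilon-greedy: explore with probability epsilon, else take the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.value))
        return max(range(len(self.value)), key=lambda a: self.value[a])

    def update(self, action, reward, alpha_step=0.02, eps_step=0.02):
        prev = self.prev_reward[action]
        if prev is not None:
            change = reward - prev
            trend = change * self.prev_change[action]
            if trend > 0:    # assumed trend: two changes in the same direction
                self.alpha[action] = min(0.5, self.alpha[action] + alpha_step)
            elif trend < 0:  # direction flipped: assume no trend, shrink the rate
                self.alpha[action] = max(0.01, self.alpha[action] - alpha_step)
            self.prev_change[action] = change
        self.prev_reward[action] = reward

        # Move the estimate toward the reward using this action's own rate.
        self.value[action] += self.alpha[action] * (reward - self.value[action])

        # Assumed success signal: reward at or above the running average.
        if reward >= self.avg_reward:
            self.epsilon = max(0.02, self.epsilon - eps_step)
        else:
            self.epsilon = min(0.5, self.epsilon + eps_step)
        self.steps += 1
        self.avg_reward += (reward - self.avg_reward) / self.steps


agent = TrendAdaptiveAgent(n_actions=2)
for r in (1.0, 2.0, 3.0, 4.0):  # steadily rising rewards for action 0
    agent.update(0, r)
for r in (0.0, 5.0, 0.0, 5.0):  # oscillating rewards for action 1
    agent.update(1, r)
print(agent.alpha, agent.epsilon)
```

In this toy run the steadily rising rewards push the learning rate of action 0 above its starting value, while the oscillating rewards pull the learning rate of action 1 below it. Keeping one rate per action means a noisy action does not slow down learning for a well-behaved one, which is the intuition the abstract suggests.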