Eugénio Oliveira, José Manuel Fonseca, and Nicholas R. Jennings
Agents that buy and sell goods or services in an electronic market must adapt to the environment’s prevailing conditions if they are to be successful. Here we propose an on-line, continuous learning mechanism tailored to agents that must learn how to behave when negotiating for resources (goods or services). Taking advantage of a specific characteristic of the price adaptation problem, namely that the different price states are ordered, we propose a reinforcement learning strategy that achieves both good stability and fast convergence. Our method works by positively reinforcing all the lower value states if a particular state is successful and negatively reinforcing all the higher value states when a failure occurs. In several different market situations, agents using the resulting adaptive behaviour performed better than non-adaptive agents, and the behaviour led to a Nash equilibrium when faced with other adaptive opponents.
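The ordered-state reinforcement rule described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `reinforce`, the learning rate `rate`, the target values `reward` and `penalty`, and the exponential-average update are all assumptions made for illustration. The sketch also assumes a seller's perspective (a deal at a given price implies that any lower price would have succeeded too) and that the chosen state itself is reinforced along with its neighbours.

```python
from typing import List


def reinforce(
    values: List[float],
    chosen: int,
    success: bool,
    rate: float = 0.1,      # assumed learning rate, not from the source
    reward: float = 1.0,    # assumed target value after a success
    penalty: float = 1.0,   # assumed magnitude of the target after a failure
) -> List[float]:
    """Return an updated copy of the per-state values.

    `values[i]` is the learned value of the i-th price state, with states
    ordered from the lowest price (index 0) to the highest.
    """
    if not 0 <= chosen < len(values):
        raise IndexError("chosen state is out of range")

    updated = list(values)
    if success:
        # Success at `chosen`: reinforce it and every lower value state.
        targets = range(0, chosen + 1)
        target_value = reward
    else:
        # Failure at `chosen`: penalise it and every higher value state.
        targets = range(chosen, len(values))
        target_value = -penalty

    for i in targets:
        # Move each affected state a fraction `rate` towards the target.
        updated[i] += rate * (target_value - updated[i])
    return updated


if __name__ == "__main__":
    state_values = [0.0] * 5                                 # five ordered price states
    state_values = reinforce(state_values, 2, success=True)   # deal closed at state 2
    state_values = reinforce(state_values, 3, success=False)  # offer rejected at state 3
    print(state_values)
```

Because a single outcome updates a whole range of states rather than only the one that was tried, information spreads across the ordered price scale quickly, which is one plausible reading of how the approach combines fast convergence with stability.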