Thomas Tran, University of Waterloo
We propose a reputation-oriented reinforcement learning algorithm for buying and selling agents in electronic market environments. We take into account the fact that multiple selling agents may offer the same good at different quality levels. In our approach, buying agents learn to avoid the risk of purchasing low-quality goods and to maximize the expected value of the goods they purchase by dynamically maintaining sets of reputable sellers. Selling agents learn to maximize their expected profits by adjusting product prices and, optionally, altering the quality of their goods. We expect this approach to improve satisfaction for both buying and selling agents and to reduce computational cost for buying agents.
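To make the buyer side of this idea concrete, the following is a minimal illustrative sketch, not the algorithm as specified in the paper. The class name `Buyer`, the reputation range of [-1, 1], the update rules, and the parameters `reputable_threshold`, `learning_rate`, `explore_prob`, and `satisfactory_value` are all assumptions introduced for illustration. It shows a buyer that tracks a reputation rating and an expected-value estimate per seller, restricts its choice to the current set of reputable sellers when that set is non-empty, and updates both quantities after each purchase. The selling agents' price and quality adjustments are left out.

```python
import random


class Buyer:
    """Illustrative buyer that maintains a set of reputable sellers.

    All update rules and parameter values are assumptions made for this
    sketch; they are not taken from the paper.
    """

    def __init__(self, sellers, reputable_threshold=0.5, learning_rate=0.2,
                 explore_prob=0.1, rng=None):
        # Reputation ratings are kept in [-1, 1]; unknown sellers start at 0.
        self.reputation = {s: 0.0 for s in sellers}
        # Running estimate of the value obtained by buying from each seller.
        self.expected_value = {s: 0.0 for s in sellers}
        self.reputable_threshold = reputable_threshold
        self.learning_rate = learning_rate
        self.explore_prob = explore_prob
        self.rng = rng or random.Random(0)

    def reputable_sellers(self):
        """Return the sellers whose rating meets the assumed threshold."""
        return {s for s, r in self.reputation.items()
                if r >= self.reputable_threshold}

    def choose_seller(self):
        """Pick the reputable seller with the highest expected value.

        Falls back to all sellers when no seller is reputable yet, and
        occasionally explores a random seller.
        """
        if self.rng.random() < self.explore_prob:
            return self.rng.choice(sorted(self.reputation))
        candidates = self.reputable_sellers() or set(self.reputation)
        return max(sorted(candidates), key=lambda s: self.expected_value[s])

    def update(self, seller, value, satisfactory_value=0.0):
        """Update the value estimate and reputation after a purchase."""
        a = self.learning_rate
        # Move the value estimate toward the observed value.
        self.expected_value[seller] += a * (value - self.expected_value[seller])
        r = self.reputation[seller]
        if value >= satisfactory_value:
            r = r + a * (1.0 - r)   # satisfactory purchase: move toward +1
        else:
            r = r - a * (1.0 + r)   # unsatisfactory purchase: move toward -1
        self.reputation[seller] = max(-1.0, min(1.0, r))
```

A buyer constructed with `explore_prob=0.0` and updated with a few satisfactory purchases from one seller will, under these assumed rules, place that seller in its reputable set and keep choosing it. Restricting the choice to a small reputable set is one plausible way the buyer's per-decision cost could shrink, since fewer sellers need to be compared.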