Ron Katz, Sarit Kraus
In this paper we propose a model for human learning and decision making in environments of repeated Cliff-Edge (CE) interactions. In CE environments, which include common daily interactions such as sealed-bid auctions and the Ultimatum Game (UG), the probability of success decreases monotonically as the expected reward increases. Thus, CE environments are characterized by an underlying conflict between the drive to maximize profits and the fear of causing the entire deal to fall through. We focus on the behavior of people who repeatedly compete in one-shot CE interactions, with a different opponent in each interaction. Our model, which is based upon the Deviated Virtual Reinforcement Learning (DVRL) algorithm, integrates the Learning Direction Theory with the Reinforcement Learning algorithm. We also examined several other models, using an innovative methodology in which the decision dynamics of the models were compared with the empirical decision patterns of individuals during their interactions. An analysis of human behavior in auctions and in the UG reveals that our model fits the decision patterns of far more subjects than any other model.
Subjects: 4. Cognitive Modeling; 12.1 Reinforcement Learning