Recommendation as a Stochastic Sequential Decision Problem

Ronen Brafman, David Heckerman, and Guy Shani

Recommender systems -- systems that suggest items that might interest users of e-commerce sites -- typically adopt a static view of the recommendation process and treat it as a prediction problem. In an earlier paper, we argued that it is more appropriate to view the generation of recommendations as a sequential decision problem and, consequently, that Markov decision processes (MDPs) provide a more appropriate model for recommender systems. MDPs offer two benefits: they take into account the long-term effects of each recommendation, and they take into account the expected value of each recommendation. The use of MDPs in a commercial site raises three fundamental problems: providing an adequate initial model, updating this model online as new items (e.g., books) arrive, and coping with the enormous state space of this model. In past work, we dealt with the first problem. In this paper we consider the second and, especially, the third problem, which is of greater concern to researchers in decision-theoretic planning. We show that although the model we consider has roughly 10^11 states, we can quickly provide an approximate solution by exploiting its special structure. Our memory requirements -- a serious concern for commercial online applications -- are modest, and the overall resource requirements of our system are comparable to those of a well-known commercial recommender system that uses a simpler and less accurate model. Our system is one of a handful of deployed commercial recommender systems, as well as one of a handful of deployed MDP-based systems. It has been running at a commercial online bookstore since August 2002.

This page is copyrighted by AAAI. All rights reserved.