AAAI Publications, The Twenty-Eighth International Flairs Conference

A Multiple-Play Bandit Algorithm Applied to Recommender Systems
Jonathan Louedec, Max Chevalier, Josiane Mothe, Aurélien Garivier, Sébastien Gerchinovitz

Last modified: 2015-04-07


For several web tasks such as ad placement or e-commerce, recommender systems must recommend multiple items to their users; such problems can be modeled as bandits with multiple plays. State-of-the-art methods require running as many single-play bandit algorithms as there are items to recommend. By contrast, recent theoretical work in the machine learning literature has designed new algorithms that address the multiple-play case directly, and these algorithms were proven to have strong theoretical guarantees. In this paper we compare one such multiple-play algorithm with previous methods. We show on two real-world datasets that the multiple-play algorithm we use converges to equivalent values but learns about three times faster than state-of-the-art methods. We also show that carefully adapting these earlier methods can improve their performance.
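To illustrate the setting the abstract describes, the sketch below shows a generic multiple-play bandit in the UCB1 style: each round, instead of running one single-play bandit per recommendation slot, a single index policy selects the `n_plays` arms with the highest upper-confidence indices and updates them all from the observed feedback. This is a minimal illustration of the multiple-play setting, not the specific algorithm evaluated in the paper; the function name and parameters are illustrative assumptions.

```python
import math
import random

def multiple_play_ucb(arm_probs, n_plays, horizon, seed=0):
    """Generic multiple-play UCB1-style bandit (illustrative sketch,
    not the paper's algorithm): each round, play the n_plays arms
    with the highest upper-confidence indices on Bernoulli rewards."""
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm
    total_reward = 0.0
    for t in range(1, horizon + 1):
        # UCB index: empirical mean plus exploration bonus;
        # arms never played yet get priority via an infinite index.
        def index(a):
            if counts[a] == 0:
                return float("inf")
            return sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
        # Recommend the n_plays arms with the largest indices.
        chosen = sorted(range(n_arms), key=index, reverse=True)[:n_plays]
        for a in chosen:
            reward = 1.0 if rng.random() < arm_probs[a] else 0.0
            counts[a] += 1
            sums[a] += reward
            total_reward += reward
    return counts, total_reward
```

Run on four arms with success probabilities `[0.9, 0.8, 0.1, 0.1]` and two plays per round, the pull counts concentrate on the two best arms as the horizon grows, which is the behavior that regret guarantees for multiple-play algorithms formalize.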


Information Retrieval; Recommender Systems; Reinforcement Learning; Bandit Problems
