Xiaoyuan Su, Taghi M. Khoshgoftaar, Russell Greiner
Recommendation systems suggest products to users. Collaborative filtering (CF) systems, which base those recommendations on a database of previous ratings given by various users to various products, have proven very effective. Since this database is typically very sparse, we consider first imputing the missing values, then making predictions based on the completed dataset. In this paper, we apply several standard imputation techniques within the framework of imputation-boosted collaborative filtering (IBCF). Each technique passes its imputed rating data to a traditional Pearson correlation-based CF algorithm, which uses that information to produce CF predictions. We also propose a novel mixture IBCF algorithm, IBCF-NBM, that uses either naïve Bayes or mean imputation, depending on the sparsity of the original CF rating dataset. Our empirical results show that IBCFs are fairly accurate on CF tasks, and that IBCF-NBM significantly outperforms a representative hybrid CF system, the content-boosted CF algorithm, as well as other IBCFs that use standard imputation techniques.
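The two-stage idea described above (impute first, then run Pearson correlation-based CF on the completed matrix) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the choice of item-mean imputation, the use of all other users as neighbors, and the mean-centered weighted-sum prediction rule are assumptions made for illustration. The naïve Bayes imputer and the sparsity-based switch of IBCF-NBM are omitted because the abstract does not specify them; `sparsity` only shows the quantity such a switch would inspect.

```python
from math import sqrt

def sparsity(ratings):
    """Fraction of missing (None) entries in a user x item rating matrix."""
    total = sum(len(row) for row in ratings)
    missing = sum(1 for row in ratings for r in row if r is None)
    return missing / total

def impute_item_mean(ratings):
    """Fill each missing rating with the mean of the observed ratings for
    that item (assumed variant of mean imputation); fall back to the
    global mean when an item has no observed ratings."""
    observed = [r for row in ratings for r in row if r is not None]
    global_mean = sum(observed) / len(observed)
    n_items = len(ratings[0])
    item_means = []
    for j in range(n_items):
        col = [row[j] for row in ratings if row[j] is not None]
        item_means.append(sum(col) / len(col) if col else global_mean)
    return [[item_means[j] if row[j] is None else row[j]
             for j in range(n_items)] for row in ratings]

def pearson(a, b):
    """Pearson correlation of two equal-length rating vectors
    (0.0 when either vector has zero variance)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict(filled, user, item):
    """Mean-centered, correlation-weighted prediction for (user, item),
    computed on the imputed matrix using every other user as a neighbor."""
    mean_u = sum(filled[user]) / len(filled[user])
    num = den = 0.0
    for v, row in enumerate(filled):
        if v == user:
            continue
        w = pearson(filled[user], row)
        num += w * (row[item] - sum(row) / len(row))
        den += abs(w)
    return mean_u if den == 0 else mean_u + num / den

# Toy 3-user x 3-item matrix; None marks a missing rating.
ratings = [[5, 3, None],
           [4, None, 2],
           [None, 1, 4]]
filled = impute_item_mean(ratings)
estimate = predict(filled, user=0, item=2)
```

Because the imputed matrix is dense, the correlation between two users is computed over all items rather than only the co-rated ones, which is the practical effect of imputing before filtering.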
Subjects: 12. Machine Learning and Discovery; 10. Knowledge Acquisition
Submitted: Feb 22, 2008