William W. Cohen, Daniel Kudenko
Any system that learns how to filter documents will suffer poor performance during an initial training phase. One way of addressing this problem is to exploit filters learned by other users in a collaborative fashion. We investigate "direct transfer" of learned filters in this setting, a limiting case for any collaborative learning system. We evaluate the stability of several different learning methods under direct transfer, and conclude that symbolic learning methods that use negatively correlated features of the data perform poorly in transfer, even when they perform well in more conventional evaluation settings. This effect is robust: it holds for several learning methods, it holds when a diverse set of users is used in training the classifier, and it holds even when the learned classifiers can be adapted to the new user’s distribution. Our experiments give rise to several concrete proposals for improving generalization performance in a collaborative setting, including a beneficial variation on a feature selection method that has been widely used in text categorization.