Mario Marchand, Mohak Shah, John Shawe-Taylor, and Marina Sokolova
We examine the set covering machine when it uses data-dependent half-spaces for its set of features and bound its generalization error in terms of the number of training errors and the number of half-spaces it achieves on the training data. We show that it provides a favorable alternative to data-dependent balls on some natural data sets. Compared to the support vector machine, the set covering machine with data-dependent halfspaces produces substantially sparser classifiers with comparable (and sometimes better) generalization. Furthermore, we show that our bound on the generalization error provides an effective guide for model selection.