Bernhard Scholkopf, Chris Burges, Vladimir Vapnik
We report a novel possibility for extracting a small subset of a data base which contains all the information necessary to solve a given classification task: using the Support Vector Algorithm to train three different types of handwritten digit classifiers, we observed that these types of classifiers construct their decision surface from strongly overlapping small (k: 4%) subsets of the data base. This finding opens up the possibiiity of compressing data bases significantly by disposing of the data which is not important for the solution of a given task. In addition, we show that the theory allows us to predict the classifier that will have the best generalization ability, based solely on performance on the training set and characteristics of the learning machines. This finding is important for cases where the amount of available data is limited.