Stephan Schulz, Technische Universität München, Germany
We model the learning of classifications as a combination of abstraction and class assignment. We discuss the problem of selecting the most suitable among several candidate abstractions for this purpose. Weaker abstractions perform better on training sets, but typically do not generalize well. Stronger abstractions often generalize better, but may discard properties that are important for the classification. We introduce relative information gain as a criterion for finding an optimal balance between the precision and the generality of abstractions. Experimental results with abstractions used for the classification of terms indicate the success of this approach.
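
The trade-off described above can be illustrated with a small sketch. The abstract does not state the formula for relative information gain, so the code below rests on an assumption: it takes the information gain of an abstraction about the class labels and divides it by the entropy of the partition that the abstraction induces, in the manner of a gain ratio. The function names, the toy terms, and the three candidate abstractions are hypothetical and serve only as illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of hashable labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def relative_information_gain(items, classes, abstraction):
    """Information gain of an abstraction about the classes, normalized.

    ASSUMPTION: the normalizer is the entropy of the partition induced by
    the abstraction (a gain-ratio style measure). The source text does not
    give the exact definition.
    """
    # Group the class labels by the abstract value of each item.
    groups = defaultdict(list)
    for item, cls in zip(items, classes):
        groups[abstraction(item)].append(cls)

    total = len(classes)
    # Remaining class uncertainty once the abstract value is known.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(classes) - conditional

    # Entropy of the abstraction's own partition of the items.
    split = entropy([abstraction(item) for item in items])
    return gain / split if split > 0 else 0.0


# Hypothetical training set: terms written as strings, with two classes.
terms = ["f(a)", "f(b)", "f(c)", "g(a)", "g(b)", "g(c)"]
classes = ["+", "+", "+", "-", "-", "-"]

# Three hypothetical abstractions, ordered from weakest to strongest.
abstractions = {
    "identity": lambda t: t,        # keeps every detail of the term
    "top_symbol": lambda t: t[0],   # keeps only the outermost symbol
    "constant": lambda t: "any",    # maps every term to one value
}

scores = {
    name: relative_information_gain(terms, classes, fn)
    for name, fn in abstractions.items()
}
best = max(scores, key=scores.get)
```

On this toy data the identity abstraction separates the training set perfectly but splits it into six singleton groups, so its score is only about 0.39. The top-symbol abstraction separates the classes just as well with two groups and scores 1.0. The constant abstraction carries no information and scores 0.0. Under the assumed definition the intermediate abstraction is therefore selected.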