Stefanie Brueninghaus and Kevin D. Ashley
In this paper, we discuss the benefits and limitations of Machine Learning (ML) for Case-Based Reasoning (CBR) in domains where the cases are text documents. In textual CBR, the bottleneck is often indexing new cases. ML has the potential to help build large case bases from a small start-up collection by learning to classify texts under the index terms. However, in experiments with a real CBR system, we found that the problem is often beyond the power of purely inductive ML: CBR indices are very complex, and the number of training instances in a typical case base is too small to generalize from reliably. We argue that adding domain knowledge can help overcome these problems, and we give illustrative examples.