Bill Gale, Kenneth Church, and David Yarowsky
Our task is to tag each noun in a corpus with a "sense," chosen from a small list of tags for that noun. Most previous work falls into one of three camps: (1) Qualitative Methods, (2) Dictionary-based Methods, and (3) Discrimination Methods. In each case, the work has been limited by an inability to obtain sufficiently large sets of training material. Our work falls in the third group. These methods take sets of training examples as input and develop some means of discriminating among the sets based on this training data. Most previous work in this line has used hand-tagged sets of examples, so the bottleneck has been the acquisition of training material. In our view, the crux of the problem in developing discrimination methods for word sense disambiguation is to find a strategy for acquiring sufficiently large sets of training material. We think that we have found two such strategies for acquiring testing and training materials: one that we have discussed previously and review here, and another that is discussed here for the first time. Beyond the classical discrimination problem lie various problems in building a practical system, the most pressing of which is to limit the number of parameters: if there are about 10^5 senses to be discriminated, the direct approach of building one model per sense will allow only a few parameters for each sense. Much of this paper addresses this problem.