Dan Winchester and Mark Lee
Proper names occur frequently in all types of natural language text, yet their treatment remains under-researched in natural language processing. One particular problem is how to link information about the same entity when it is referred to by possibly different proper names across several documents. In this paper we describe a prototype system that first pre-processes individual documents using a simple name-conflation algorithm and then uses an adaptation of Schütze's context-group discrimination algorithm to cluster documents judged to contain references to the same named entity. We use this system to assess the potential utility of different contextual cues for the task.
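To make the two-stage pipeline concrete, the following is a minimal illustrative sketch and not the system described in this paper. It assumes a token-subset rule for name conflation and replaces the adapted context-group discrimination step with a much simpler stand-in: first-order bag-of-words context vectors grouped by greedy cosine-threshold clustering. All function names, the stopword list, the window size, and the threshold are hypothetical choices made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical stopword list, assumed here only to keep the toy vectors meaningful.
STOPWORDS = {"the", "a", "in", "on", "by", "of", "and"}


def conflate_names(mentions):
    """Map each mention to the longest canonical mention whose tokens contain it.

    Assumed rule: "Smith" is conflated with "John Smith" when its tokens
    are a subset of the longer name's tokens.
    """
    # Longest names first, so full names become canonical before short forms.
    ordered = sorted(set(mentions), key=lambda m: (-len(m.split()), m))
    canonical = {}
    for mention in ordered:
        tokens = set(mention.split())
        for full in ordered:
            is_canonical = canonical.get(full) == full
            if is_canonical and tokens <= set(full.split()):
                canonical[mention] = full
                break
        else:
            canonical[mention] = mention
    return canonical


def context_vector(text, name, window=5):
    """Count non-stopword tokens within `window` words of each occurrence of `name`."""
    words = text.lower().split()
    name_tokens = name.lower().split()
    n = len(name_tokens)
    vec = Counter()
    for i in range(len(words) - n + 1):
        if words[i:i + n] == name_tokens:
            context = words[max(0, i - window):i] + words[i + n:i + n + window]
            vec.update(w for w in context if w not in STOPWORDS)
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.3):
    """Greedy clustering: join the most similar cluster at or above `threshold`."""
    clusters = []  # each entry: (summed context vector, list of document indices)
    for idx, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for candidate in clusters:
            sim = cosine(vec, candidate[0])
            if sim >= best_sim:
                best, best_sim = candidate, sim
        if best is None:
            clusters.append((Counter(vec), [idx]))
        else:
            best[0].update(vec)
            best[1].append(idx)
    return [members for _, members in clusters]


# Toy documents, invented for illustration only.
docs = [
    "the senator john smith voted on the budget bill in congress",
    "budget bill debate led by senator john smith in congress today",
    "striker john smith scored a goal in the cup final",
]
vectors = [context_vector(d, "John Smith") for d in docs]
groups = cluster(vectors)
```

In this toy run the two documents that mention the name in a political context share context words and end up grouped together, while the sports document forms its own group. Varying which context words feed the vectors is one simple way to compare the utility of different contextual cues.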