Aarti Gupta, Tim Oates
A variety of text processing tasks require or benefit from semantic resources such as ontologies and lexicons. Creating these resources manually is tedious, time consuming, and prone to error. We present a new algorithm for using the web to determine the correct concept in an existing ontology to lexicalize previously unknown words, such as might be discovered while processing texts. A detailed empirical comparison of our algorithm with two existing algorithms (Cilibrasi & Vitanyi 2004, Maedche et al. 2002) is described, leading to insights into the sources of the algorithms' strengths and weaknesses.
Subjects: 11.2 Ontologies; 12. Machine Learning and Discovery
Submitted: Oct 16, 2006