Raw Corpus Word Sense Disambiguation

Ted Pedersen

A wide range of approaches have been applied to word sense disambiguation. However, most require manually crafted knowledge such as annotated text, machine-readable dictionaries or thesauri, semantic networks, or aligned bilingual corpora. The reliance on these knowledge sources limits portability, since they generally exist only for selected domains and languages. This poster presents a corpus-based approach in which multiple usages of an ambiguous word are divided into a specified number of sense groups, based strictly on features that are automatically obtained from the immediately surrounding raw text.
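The general idea of grouping usages by their raw-text context can be illustrated with a minimal sketch. The abstract does not name the features or the clustering algorithm, so everything below is an assumption made for illustration: the features are simple counts of nearby words, a small hypothetical stop-word list is applied, and a k-means-style procedure with cosine similarity stands in for whatever method the poster actually uses. The names `context_features`, `cluster_usages`, and `STOPWORDS` are hypothetical.

```python
from collections import Counter

# Hypothetical stop-word list; not taken from the poster.
STOPWORDS = {"the", "a", "of", "on", "for", "at", "to", "in", "and"}

def context_features(tokens, target, window=3):
    """Count non-stop-words within `window` positions of each `target` token."""
    feats = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), i + window + 1
            for j in range(lo, min(hi, len(tokens))):
                if j != i and tokens[j] not in STOPWORDS:
                    feats[tokens[j]] += 1
    return feats

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_usages(sentences, target, k=2, window=3, iterations=10):
    """Assign each usage of `target` to one of `k` groups (assumed k-means style)."""
    vectors = [context_features(s.lower().split(), target, window) for s in sentences]
    if not vectors:
        return []
    k = min(k, len(vectors))
    # Deterministic seeding: start with the first usage, then repeatedly add
    # the usage least similar to the seeds chosen so far.
    seeds = [0]
    while len(seeds) < k:
        candidates = [i for i in range(len(vectors)) if i not in seeds]
        seeds.append(min(
            candidates,
            key=lambda i: max(cosine(vectors[i], vectors[s]) for s in seeds),
        ))
    centroids = [Counter(vectors[s]) for s in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each usage joins its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the summed counts of its members.
        sums = [Counter() for _ in range(k)]
        for v, lab in zip(vectors, labels):
            sums[lab].update(v)
        centroids = [sums[c] if sums[c] else centroids[c] for c in range(k)]
    return labels

sentences = [
    "she sat on the bank of the river fishing",
    "the bank approved the loan for the house",
    "we walked along the river bank at dusk",
    "he asked the bank for a loan today",
]
labels = cluster_usages(sentences, "bank", k=2, window=3)  # -> [0, 1, 0, 1]
```

No sense inventory or annotated text is consulted here; the number of groups `k` is the only thing specified in advance, which mirrors the "specified number of sense groups" in the abstract. The groups are unlabeled, so mapping group 0 or 1 to a dictionary sense would be a separate step.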

