Thanh Phong Pham, Hwee Tou Ng, Wee Sun Lee
Content Area: 14. Natural Language Processing & Speech Recognition
Current word sense disambiguation (WSD) systems based on supervised learning are still limited in that they do not work well for all words in a language. One of the main reasons is the lack of sufficient training data. In this paper, we investigate the use of unlabeled training data for WSD within the framework of semi-supervised learning. Four semi-supervised learning algorithms are evaluated on 29 nouns of the Senseval-2 (SE2) English lexical sample task and on the SE2 English all-words task. Empirical results show that unlabeled data can bring significant improvement in WSD accuracy.
Subjects: 13. Natural Language Processing
Submitted: May 10, 2005