AAAI Publications, Twenty-First International Joint Conference on Artificial Intelligence

Web-Scale N-gram Models for Lexical Disambiguation
Shane Bergsma, Dekang Lin, Randy Goebel

Last modified: 2009-06-26

Abstract


Web-scale data has been used in a diverse range of language research. Most of this research has used web counts for only short, fixed spans of context. We present a unified view of using web counts for lexical disambiguation. Unlike previous approaches, our supervised and unsupervised systems combine information from multiple, overlapping segments of context. On the tasks of preposition selection and context-sensitive spelling correction, the supervised system reduces disambiguation error by 20-24% relative to the current state of the art.
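To make the idea of combining counts from multiple, overlapping segments of context concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical in-memory count table (`toy_counts`, with invented numbers), considers every n-gram of 2 to 5 tokens that covers the ambiguous position, and scores each candidate by summing add-one-smoothed log counts. The function names, the scoring rule, and the n-gram size range are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

NGram = Tuple[str, ...]


def spanning_ngrams(tokens: Sequence[str], pos: int,
                    min_n: int = 2, max_n: int = 5) -> List[NGram]:
    """Return every n-gram (min_n..max_n tokens) that covers index `pos`."""
    grams: List[NGram] = []
    for n in range(min_n, max_n + 1):
        # An n-gram covering `pos` can start anywhere from pos-n+1 to pos.
        for start in range(pos - n + 1, pos + 1):
            end = start + n
            if start >= 0 and end <= len(tokens):
                grams.append(tuple(tokens[start:end]))
    return grams


def score(tokens: Sequence[str], pos: int, candidate: str,
          counts: Dict[NGram, int]) -> float:
    """Sum of log(count + 1) over all overlapping n-grams (assumed scoring rule)."""
    filled = list(tokens)
    filled[pos] = candidate
    return sum(math.log(counts.get(g, 0) + 1)
               for g in spanning_ngrams(filled, pos))


def disambiguate(tokens: Sequence[str], pos: int,
                 candidates: Sequence[str],
                 counts: Dict[NGram, int]) -> str:
    """Pick the candidate whose overlapping contexts have the highest score."""
    return max(candidates, key=lambda c: score(tokens, pos, c, counts))


# Hypothetical counts, invented for this example only.
toy_counts: Dict[NGram, int] = {
    ("depends", "on"): 900,
    ("on", "the"): 5000,
    ("in", "the"): 6000,
    ("depends", "on", "the"): 400,
    ("on", "the", "weather"): 50,
    ("in", "the", "weather"): 30,
    ("depends", "on", "the", "weather"): 5,
}

sentence = ["depends", "_", "the", "weather"]
choice = disambiguate(sentence, 1, ["on", "in"], toy_counts)
```

With these toy counts, the bigram ("in", "the") alone is more frequent than ("on", "the"), so a single short, fixed span would prefer "in"; summing evidence across all overlapping spans selects "on" instead.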

Full Text: PDF