Sanda M. Harabagiu and Marius Pasca
This paper presents a novel methodology for resolving prepositional phrase attachment ambiguities. The approach consists of three phases. First, we rely on a publicly available database to classify a large corpus of prepositional attachments extracted from Treebank parses. As a by-product, the arguments of every prepositional relation are semantically disambiguated. In the second phase, the thematic interpretation of the prepositional relations provides additional knowledge. In the third phase, attachment decisions are learned from word class knowledge and relation type features. The learning technique builds upon some of the most popular current statistical techniques. We have tested this methodology on (1) Wall Street Journal articles, (2) textual definitions of concepts from a dictionary, and (3) an ad hoc corpus of Web documents used for conceptual indexing and information extraction.
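
To make the learning task concrete, the following is a minimal sketch of how an attachment decision might be learned from word classes. It is not the method of this paper: it swaps in a generic count-based classifier with back-off, omits relation type features entirely, and uses a hand-written `WORD_CLASS` table and toy `TRAIN` data that are purely hypothetical stand-ins for classes drawn from a lexical database and attachments extracted from parsed text.

```python
from collections import Counter, defaultdict

# Hypothetical word-class table (assumption for illustration only).
WORD_CLASS = {
    "ate": "consume", "devoured": "consume",
    "pizza": "food", "pasta": "food",
    "fork": "instrument", "spoon": "instrument",
    "anchovies": "food", "mushrooms": "food",
}

# Toy quadruples (verb, noun1, preposition, noun2) with attachment labels:
# "V" = the PP attaches to the verb, "N" = the PP attaches to noun1.
TRAIN = [
    (("ate", "pizza", "with", "fork"), "V"),
    (("ate", "pasta", "with", "spoon"), "V"),
    (("ate", "pizza", "with", "anchovies"), "N"),
    (("devoured", "pasta", "with", "mushrooms"), "N"),
]


def features(quad):
    """Return feature tuples from most specific to most general."""
    verb, noun1, prep, noun2 = quad

    def cls(word):
        return WORD_CLASS.get(word, "unknown")

    return [
        ("full", cls(verb), cls(noun1), prep, cls(noun2)),
        ("prep+n2", prep, cls(noun2)),
        ("prep", prep),
    ]


def train(examples):
    """Count attachment labels for every feature tuple at every level."""
    counts = defaultdict(Counter)
    for quad, label in examples:
        for feat in features(quad):
            counts[feat][label] += 1
    return counts


def predict(counts, quad, default="N"):
    """Use the most specific feature with a clear majority, else back off."""
    for feat in features(quad):
        seen = counts.get(feat)
        if seen:
            ranked = seen.most_common()
            # Accept only a strict majority; a tie falls through to a coarser level.
            if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
                return ranked[0][0]
    return default


model = train(TRAIN)
```

With this toy data, an unseen quadruple such as `("devoured", "pizza", "with", "spoon")` is resolved through its word classes rather than its surface words, which is the intuition behind learning from word class knowledge. The default of noun attachment when no level is decisive is an arbitrary choice made for the sketch.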