Sebastian Pado, Mirella Lapata
This paper considers the problem of unsupervised semantic lexicon acquisition. We introduce a fully automatic approach that exploits parallel corpora, relies on shallow text properties, and is relatively inexpensive. Given the English FrameNet lexicon, our method uses word alignments to generate frame candidate lists for new languages, which are then pruned automatically using a small set of linguistically motivated filters. Evaluation shows that our approach can produce high-precision multilingual FrameNet lexicons without recourse to bilingual dictionaries or deep syntactic and semantic analysis.
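The two-stage idea in the abstract (project frames across word alignments, then prune the candidates) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `candidate_lists` and `prune`, the toy lexicon and alignment links, and in particular the frequency thresholds, which merely stand in for the linguistically motivated filters that the abstract mentions but does not specify.

```python
from collections import Counter, defaultdict


def candidate_lists(frame_lexicon, aligned_pairs):
    """Project frames across word alignments (illustrative sketch).

    frame_lexicon: dict mapping frame name -> set of English words evoking it.
    aligned_pairs: iterable of (english_word, target_word) alignment links.
    Returns a dict mapping frame name -> Counter of target-language candidates.
    """
    # Invert the lexicon so each English word points to the frames it can evoke.
    frames_of = defaultdict(set)
    for frame, words in frame_lexicon.items():
        for word in words:
            frames_of[word].add(frame)

    # Every alignment link votes for its target word in each projected frame.
    candidates = defaultdict(Counter)
    for english, target in aligned_pairs:
        for frame in frames_of.get(english, ()):
            candidates[frame][target] += 1
    return candidates


def prune(candidates, min_count=2, min_share=0.2):
    """Hypothetical frequency filter; NOT the filters used in the paper."""
    pruned = {}
    for frame, counts in candidates.items():
        total = sum(counts.values())
        pruned[frame] = {
            word for word, n in counts.items()
            if n >= min_count and n / total >= min_share
        }
    return pruned


# Toy data, invented for illustration only.
lexicon = {"Statement": {"say", "announce"}, "Motion": {"go"}}
links = [("say", "sagen"), ("say", "sagen"), ("announce", "sagen"),
         ("say", "der"), ("go", "gehen"), ("go", "gehen"), ("go", "zu")]
result = prune(candidate_lists(lexicon, links))
print(result)
```

On this toy input, the spurious links to "der" and "zu" occur only once each and are dropped, leaving "sagen" for Statement and "gehen" for Motion. Real alignments are far noisier, which is presumably why the approach relies on dedicated filters rather than a simple count threshold.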
Content Area: 14. Natural Language Processing & Speech Recognition
Subjects: 13. Natural Language Processing; 11.2 Ontologies
Submitted: May 10, 2005