Takuya Matsuzaki, Yusuke Miyao, Jun’ichi Tsujii
An efficient parsing technique for HPSG is presented. Recent research has shown that supertagging is a key technology to improve both the speed and accuracy of lexicalized grammar parsing. We show that further speed-up is possible by eliminating non-parsable lexical entry sequences from the output of the supertagger. The parsability of the lexical entry sequences is tested by a technique called CFG-filtering, where a CFG that approximates the HPSG is used to test it. Those lexical entry sequences that passed through the CFG-filter are combined into parse trees by using a simple shift-reduce parsing algorithm, in which structural ambiguities are resolved using a classifier and all the syntactic constraints represented in the original grammar are checked. Experimental results show that our system gives comparable accuracy with a speed-up by a factor of six (30 msec/sentence) compared with the best published result using the same grammar.
Subjects: 13. Natural Language Processing; 13.3 Syntax
Submitted: Oct 11, 2006