AAAI Publications, The Twenty-Seventh International FLAIRS Conference

Part of Speech Induction from Distributional Features: Balancing Vocabulary and Context
Vivek V. Datla, King-Ip Lin, Max Louwerse

Last modified: 2014-05-03

Abstract


Past research on grammar induction has reported promising results in predicting parts of speech from n-grams using a fixed vocabulary and a fixed context. In this study, we investigated grammar induction while varying both vocabulary size and context size. Results indicated that, as context increased for a fixed vocabulary, overall accuracy initially increased but then leveled off. Importantly, this increase in accuracy did not occur at the same rate across all syntactic categories. We also address the dynamic relation between context and vocabulary in unsupervised grammar induction, and we formulate a model of the relationship between the two. Our results are consistent with what the child language acquisition literature calls the word spurt phenomenon.
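To make the two quantities varied in the study concrete, the following is a minimal sketch of how distributional features might be built from a token stream, assuming a vocabulary made of the most frequent word types and a symmetric context window. The function names `context_vectors` and `cosine`, the parameters `vocab_size` and `context_size`, and the use of cosine similarity are illustrative assumptions, not the authors' implementation; the abstract does not specify the feature construction or the clustering method.

```python
import math
from collections import Counter


def context_vectors(tokens, vocab_size, context_size):
    """Build positional co-occurrence vectors (hypothetical helper).

    Vocabulary: the vocab_size most frequent word types (an assumption).
    Context: the context_size positions to the left and right of a word.
    """
    counts = Counter(tokens)
    vocab = [word for word, _ in counts.most_common(vocab_size)]
    index = {word: i for i, word in enumerate(vocab)}

    # One feature per (relative offset, context word) pair.
    offsets = [o for o in range(-context_size, context_size + 1) if o != 0]
    vectors = {word: [0] * (len(offsets) * len(vocab)) for word in vocab}

    for pos, word in enumerate(tokens):
        if word not in index:
            continue  # only in-vocabulary words get a vector
        for k, off in enumerate(offsets):
            j = pos + off
            if 0 <= j < len(tokens) and tokens[j] in index:
                vectors[word][k * len(vocab) + index[tokens[j]]] += 1
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab, vectors = context_vectors(tokens, vocab_size=7, context_size=1)
# "cat" and "dog" share the same neighbours here, so their vectors coincide.
similarity = cosine(vectors["cat"], vectors["dog"])
```

Words with similar vectors could then be grouped by any standard clustering algorithm. Raising `vocab_size` or `context_size` lengthens each vector, which is one way to picture the trade-off between vocabulary and context that the abstract describes.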

Keywords


Grammar induction, distributional features, clustering, language acquisition
