AAAI Publications, Twenty-Fifth International FLAIRS Conference

Syntagmatic, Paradigmatic, and Automatic N-Gram Approaches to Assessing Essay Quality
Scott Crossley, Zhiqiang Cai, Danielle S. McNamara

Last modified: 2012-05-16


Computational indices related to n-gram production were developed to assess the potential of n-gram indices to predict human scores of essay quality. A regression analysis was conducted on a corpus of 313 argumentative essays. The analysis demonstrated that a variety of n-gram indices were highly correlated with essay quality, but were also highly correlated with the number of words in the text (although many of the n-gram indices were stronger predictors of writing quality than the number of words in a text). A second regression analysis was conducted on a corpus of 88 argumentative essays that were controlled for text length differences. This analysis demonstrated that n-gram indices remained strong predictors of essay quality when text length was not a factor.
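As a rough illustration of why text length matters for n-gram indices, the sketch below computes two hypothetical indices for an essay: a raw count of bigrams that appear in a reference set, and the same count divided by the number of bigrams in the essay. The function names, the reference set, and the sample sentences are assumptions made for illustration; they are not the indices or data used in the study.

```python
def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_indices(essay, reference_ngrams, n=2):
    """Compute two illustrative (hypothetical) indices for one essay.

    raw_count:  number of essay n-grams found in the reference set;
                this tends to grow with essay length.
    proportion: raw_count divided by the number of n-grams in the essay,
                a simple length-normalized variant.
    """
    tokens = essay.lower().split()  # naive whitespace tokenization
    grams = ngrams(tokens, n)
    raw_count = sum(1 for g in grams if g in reference_ngrams)
    proportion = raw_count / len(grams) if grams else 0.0
    return raw_count, proportion


# Hypothetical reference set of frequent bigrams.
reference = {("on", "the"), ("the", "other"), ("other", "hand")}

short_essay = "on the other hand it works"
longer_essay = short_essay + " and on the other hand it fails"

print(ngram_indices(short_essay, reference))
print(ngram_indices(longer_essay, reference))
```

In this toy case the raw count doubles for the longer text while the proportion does not, which is the kind of length effect the second analysis controls for.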


writing quality; n-grams; collocations; lexical bundles
