Scott Crossley, Philip McCarthy, Danielle McNamara
Text classification remains one of the major fields of research in natural language processing. This paper evaluates the computational tool Coh-Metrix as a means of distinguishing between seemingly similar text-types. Using a discriminant analysis on a corpus of second language reading texts, the paper demonstrates that Coh-Metrix can significantly distinguish authentic text-types from ones that have been specifically simplified for second language readers. The findings are important for text classification research and for second language reading materials developers and teachers, because they demonstrate that moderate, shallow textual changes can affect discourse structures.
Subjects: 13. Natural Language Processing; 13.1 Discourse
Submitted: Jan 30, 2007