Jaime Spacco, Titus Winters, Tom Payne
We present techniques for analyzing score matrices of unit test outcomes, drawn from snapshots of CS2 student code taken throughout the development cycle. The analysis includes a technique for estimating the number of fundamentally different features in the unit tests, as well as a survey of which algorithms best match human intuition when grouping tests into related clusters. Unlike previous investigations into topic clustering of score matrices, we successfully identify algorithms that perform with good accuracy on this task. We also discuss the data gathered by the Marmoset system, which has been used to collect over 100,000 snapshots of student programs and associated test results.
Subjects: 12. Machine Learning and Discovery; 1.3 Computer-Aided Education
Submitted: May 17, 2006