Eleazar Eskin and Matt Bogosian
In this paper we present two improvements to traditional machine learning text classifiers. The first is a decomposition of the classification space into several dimensions of categories, which breaks the categorization problem into smaller, more manageable parts. We discuss when decomposition is useful. The second is the incorporation of linguistically motivated indicators to supplement the classification. These indicators provide information about the structure of the document, which is used to improve classification accuracy.