Thomas W.H. Lui, David K.Y. Chiu
To support more meaningful interpretation that accounts for the internal interdependencies among data values, a new form of high-order (multiple-valued) pattern, the Nested High-Order Pattern (NHOP), was recently proposed. An NHOP satisfies a consistent statistical criterion each time the pattern is iteratively extracted. The general High-Order Pattern (HOP), of which NHOP is a subtype, is a set of multiple associated values (identified as variable outcomes) extracted from a random N-tuple. A pattern is detected by statistical testing when its observed occurrence deviates significantly from the occurrence expected under a prior model or null hypothesis. Here we extend our work on NHOP to the classification task. The rationale is that meaningful association patterns, which involve multiple values jointly and at the same time predict the class, can reinforce the underlying regularity and hence provide a better understanding of the data domain. In this paper, we propose a classification method based on Nested High-Order Patterns (C-NHOP). In experiments on 26 UCI benchmark datasets, the method yields highly competitive and interpretable results.
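As an illustration of the detection step only, the following minimal sketch tests whether a candidate set of values co-occurs more often than expected. The abstract does not specify the test statistic or the null model, so the independence null model, the standardized residual, and the function name `standardized_residual` are all assumptions made for this example, not the authors' procedure.

```python
import math


def standardized_residual(records, pattern):
    """Compare observed and expected counts of a candidate pattern.

    records: list of equal-length tuples (one N-tuple per sample).
    pattern: dict mapping a variable index to the value it must take.

    Assumption: the null model is independence of the variables, so the
    expected count is n times the product of the marginal frequencies.
    Returns (observed, expected, z), where z is the standardized residual.
    """
    n = len(records)
    # Observed count: samples in which every value of the pattern occurs.
    observed = sum(
        all(r[i] == v for i, v in pattern.items()) for r in records
    )
    # Expected count under the assumed independence null model.
    expected = float(n)
    for i, v in pattern.items():
        expected *= sum(r[i] == v for r in records) / n
    if expected == 0:
        return observed, expected, 0.0
    z = (observed - expected) / math.sqrt(expected)
    return observed, expected, z


if __name__ == "__main__":
    # Hypothetical toy data: two variables with a strong association.
    data = (
        [("a", "x")] * 40 + [("a", "y")] * 10
        + [("b", "x")] * 10 + [("b", "y")] * 40
    )
    obs, exp, z = standardized_residual(data, {0: "a", 1: "x"})
    # A common (assumed) cutoff is |z| > 1.96 for a 5% two-sided test.
    print(obs, exp, z, abs(z) > 1.96)
```

Under these assumptions, a pattern whose residual exceeds the chosen cutoff would be flagged as a candidate association. The nesting criterion and the classification step of C-NHOP are not covered by this sketch.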
Subjects: 12. Machine Learning and Discovery
Submitted: Feb 7, 2008