Automated Pattern Mining with a Scale Dimension

Jan M. Zytkow, Jan M. Zytkow

An important but neglected aspect of automated data mining is discovering patterns at different scales in the same data. Scale plays a role analogous to error. It can be used to focus the search for patterns on differences that exceed the given scale and to disregard smaller ones. We introduce a discovery mechanism that applies to bivariate data. It combines search for maxima and minima with search for regularities in the form of equations. Groups of detected patterns are recursively searched for patterns in their parameters. If the mechanism cannot find a regularity for all data, it uses patterns discovered from the data to divide the data into subsets, and recursively explores each subset. Detected patterns are subtracted from the data and the search continues in the residua. Our mechanism seeks patterns at each scale. Applied at many scales and to many data sets, the search may seem explosive, but it terminates surprisingly fast because of data reduction and the requirements of pattern stability. We walk through an application to half a million data points, showing how our method leads to the discovery of many extrema, equations on their parameters, and equations that hold in subsets of the data or in residua. Then we analyze the clues provided by the discovered regularities about phenomena in the environment in which the data were gathered.
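The idea of using scale to disregard small differences can be illustrated with a minimal sketch. The abstract does not specify the paper's algorithm, so the code below is only one plausible reading: a maximum (or minimum) is reported only when the data subsequently fall (or rise) by more than the scale. The function name `extrema_at_scale`, its parameters, and the sample values are hypothetical and chosen purely for illustration.

```python
def extrema_at_scale(ys, scale):
    """Return (index, kind) pairs for extrema whose swing exceeds `scale`.

    Illustrative sketch only, not the authors' published procedure.
    A candidate maximum is confirmed once the series drops more than
    `scale` below it; a candidate minimum once the series rises more
    than `scale` above it. Smaller wiggles are ignored.
    """
    extrema = []
    hi_i = lo_i = 0   # indices of the running maximum and minimum
    direction = 0     # 0 = unknown, +1 = rising, -1 = falling
    for i, y in enumerate(ys):
        if y > ys[hi_i]:
            hi_i = i
        if y < ys[lo_i]:
            lo_i = i
        if direction != -1 and ys[hi_i] - y > scale:
            # Drop larger than the scale: the running maximum is confirmed,
            # unless no prior rise established that we were climbing.
            if direction == 1:
                extrema.append((hi_i, "max"))
            direction = -1
            lo_i = i
        elif direction != 1 and y - ys[lo_i] > scale:
            # Rise larger than the scale: the running minimum is confirmed.
            if direction == -1:
                extrema.append((lo_i, "min"))
            direction = 1
            hi_i = i
    return extrema


# Hypothetical bivariate sample, with x taken as the list index.
ys = [0, 10, 9, 11, 50, 48, 51, 20, 22, 19, 60]
coarse = extrema_at_scale(ys, scale=10)  # only the large swings survive
fine = extrema_at_scale(ys, scale=1)     # small wiggles are reported as well
```

Running the same data at a coarse and a fine scale shows the data reduction the abstract refers to: the coarse scale yields far fewer extrema than the fine one, so later searches over their parameters have much less to examine.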


This page is copyrighted by AAAI. All rights reserved.