Coincidence Detection: A Fast Method for Discovering Higher-Order Correlations in Multidimensional Data

Evan W. Steeg, Derek A. Robinson, and Ed Willis

We present a novel, fast method for association-mining in high-dimensional datasets. Our Coincidence Detection method, which combines random sampling and Chernoff-Hoeffding bounds with a novel coding/binning scheme, avoids the exhaustive search, prior limits on the order k of discovered associations, and exponentially large parameter space of other methods. Tight theoretical bounds on the complexity of randomized algorithms are impossible without strong input distribution assumptions. However, we observe sublinear time, space and data complexity in tests on constructed artificial datasets and in real application to important problems in bioinformatics and drug discovery. After placing the method in historical and mathematical context, we describe the method, and present theoretical and empirical results on its complexity and error.

This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.