A Kernel Approach to Comparing Distributions

Arthur Gretton, Karsten Borgwardt, Malte Rasch, Bernhard Schoelkopf, Alex Smola

We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert space (RKHS). We apply this technique to construct a two-sample test, which determines whether two sets of observations arise from the same distribution. We use this test in attribute matching for databases using the Hungarian marriage method, where it performs strongly. We also demonstrate excellent performance when comparing distributions over graphs, for which no alternative tests currently exist.
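
As an illustration of the idea, the sketch below compares two samples by the squared distance between their empirical mean embeddings in a reproducing kernel Hilbert space, a statistic commonly called the maximum mean discrepancy (MMD). The abstract does not spell out the statistic, the kernel, or how the test threshold is obtained, so everything here is an assumption: the Gaussian kernel, the fixed `bandwidth`, the biased estimator, and the permutation procedure used for the p-value (a generic stand-in, not necessarily the threshold used in the paper). The function names are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats.

    The kernel choice and the bandwidth value are assumptions for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased estimate of the squared MMD between samples xs and ys.

    This is the squared RKHS distance between the two empirical mean
    embeddings, expanded into averages of kernel evaluations.
    """
    m, n = len(xs), len(ys)
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (m * m)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (n * n)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


def permutation_test(xs, ys, num_permutations=200, seed=0):
    """Two-sample test: permutation p-value for the MMD statistic.

    Under the null hypothesis (same distribution) the pooled sample is
    exchangeable, so the statistic is recomputed on random re-splits.
    """
    rng = random.Random(seed)
    observed = mmd_squared(xs, ys)
    pooled = list(xs) + list(ys)
    m = len(xs)
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        if mmd_squared(pooled[:m], pooled[m:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (num_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = random.Random(42)
    sample_a = [(rng.gauss(0.0, 1.0),) for _ in range(30)]
    sample_b = [(rng.gauss(1.0, 1.0),) for _ in range(30)]
    statistic, p_value = permutation_test(sample_a, sample_b)
    print("MMD^2 estimate:", statistic, "p-value:", p_value)
```

Because the statistic only needs kernel evaluations between pairs of observations, the same code would apply to structured data such as graphs if `kernel` were swapped for a suitable graph kernel. In practice the bandwidth is often set from the data, for example by the median pairwise distance, rather than fixed at 1.0.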

Subjects: 12. Machine Learning and Discovery

Submitted: Apr 23, 2007

This page is copyrighted by AAAI. All rights reserved.