For two integral histograms r=(r1,...,rd) and c=(c1,...,cd) of equal sum N, the Monge-Kantorovich distance dMK(r,c) between r and c parameterized by a d-times-d distance matrix T is the minimum of all costs <F,T> taken over matrices F of the transportation polytope U(r,c). Recent results suggest that this distance is not negative definite, and hence, through Schoenberg's well-known result, exp(-t dMK) may not be a positive definite kernel for all t > 0. Rather than using directly dMK to define a similarity between r and c, we propose in this paper to investigate kernels on r and c based on the whole transportation polytope U(r,c). We prove that when r and c have binary counts, which is equivalent to stating that r and c represent clouds of points of equal size, the permanent of an adequate Gram matrix induced by the distance matrix T is a positive definite kernel under favorable conditions on T. We also show that the volume of the polytope U(r,c), that is the number of integral transportation plans, is a positive definite quantity in r and c through the Robinson-Schensted-Knuth correspondence between transportation matrices and Young Tableaux. We follow by proposing a family of positive definite kernels related to the generating function of the polytope through recent results obtained separately by A. Barvinok on the one hand, and C.Berg and A.J. Duran on the other hand. We finally present preliminary results led on a subset of the MNIST database to compare clouds of points through the permanent kernel.
Subjects: 12. Machine Learning and Discoveryn
Submitted: Oct 16, 2006