AAAI Publications, Twenty-Seventh AAAI Conference on Artificial Intelligence

Font Size: 
Walking on Minimax Paths for k-NN Search
Kye-Hyeon Kim, Seungjin Choi

Last modified: 2013-06-30


Link-based dissimilarity measures, such as shortest path or Euclidean commute time distance, base their distance on paths between nodes of a weighted graph. These measures are known to be better suited to data manifold with nonconvex-shaped clusters, compared to Euclidean distance, so that k-nearest neighbor (NN) search is improved in such metric spaces. In this paper we present a new link-based dissimilarity measure based on minimax paths between nodes. Two main benefits of minimax path-based dissimilarity measure are: (1) only a subset of paths is considered to make it scalable, while Euclidean commute time distance considers all possible paths; (2) it better captures nonconvex-shaped cluster structure, compared to shortest path distance. We define the total cost assigned to a path between nodes as Lp norm of intermediate costs of edges involving the path, showing that minimax path emerges from our Lp norm over paths framework. We also define minimax distance as the intermediate cost of the longest edge on the minimax path, then present a greedy algorithm to compute k smallest minimax distances between a query and N data points in O(log N + k log k) time. Numerical experiments demonstrate that our minimax k-NN algorithm reduce the search time by several orders of magnitude, compared to existing methods, while the quality of k-NN search is significantly improved over Euclidean distance.


k-Nearest Neighbor Search; Link-based Dissimilarity; Minimax Distance; Minimax Path

Full Text: PDF