How Many Pairwise Preferences Do We Need to Rank a Graph Consistently?
We consider the problem of optimally recovering the true ranking of n items from a randomly chosen subset of their pairwise preferences. It is well known that, without further assumptions, this requires a sample size of Ω(n^2). We analyze the problem under additional structure: a relational graph G([n], E) over the n items, together with a locality assumption that neighboring items are similar in their rankings. Because the data are preferential in nature, we embed not the graph itself but its strong product, which captures the pairwise node relationships. Furthermore, unlike the existing literature, which uses Laplacian embeddings for graph-based learning problems, we use a richer class of graph embeddings, orthonormal representations, that includes the (normalized) Laplacian as a special case. Our proposed algorithm, Pref-Rank, predicts the underlying ranking with an SVM-based approach applied to the chosen embedding of the product graph. It is the first to provide statistical consistency on two ranking losses, Kendall's tau and Spearman's footrule, with a required sample complexity of O((n^2 χ(Ḡ))^{2/3}) pairs, where χ(Ḡ) is the chromatic number of the complement graph Ḡ. Our sample complexity is smaller for dense graphs, with χ(Ḡ) characterizing the degree of node connectivity; this is intuitive given the locality assumption. For example, it is O(n^{4/3}) for a union of k-cliques and O(n^{5/3}) for random and power-law graphs, a quantity much smaller than the fundamental limit of Ω(n^2) for large n. This, for the first time, relates ranking complexity to structural properties of the graph. We also report experimental evaluations on synthetic and real-world datasets, in which our algorithm outperforms state-of-the-art methods.
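To make the two ranking losses and the sample-complexity expression concrete, the following is a minimal Python sketch. It is not the Pref-Rank algorithm. The function names (`kendall_tau_distance`, `spearman_footrule`, `pair_budget`) are hypothetical, the losses are written in their standard unnormalized forms (the paper may normalize them differently), and `pair_budget` simply evaluates (n^2 χ(Ḡ))^{2/3} while ignoring the constants hidden by the O(·) notation.

```python
from itertools import combinations


def kendall_tau_distance(sigma, pi):
    """Count item pairs that the two rankings order differently (unnormalized)."""
    return sum(
        1
        for i, j in combinations(range(len(sigma)), 2)
        if (sigma[i] - sigma[j]) * (pi[i] - pi[j]) < 0
    )


def spearman_footrule(sigma, pi):
    """Sum of absolute rank displacements between the two rankings (unnormalized)."""
    return sum(abs(s - p) for s, p in zip(sigma, pi))


def pair_budget(n, chi_complement):
    """Evaluate (n^2 * chi)^(2/3); constants hidden by O() are ignored (assumption)."""
    return (n ** 2 * chi_complement) ** (2 / 3)


# sigma[i] is the rank position assigned to item i.
true_rank = [0, 1, 2, 3]
predicted = [1, 0, 2, 3]  # items 0 and 1 are swapped

kt = kendall_tau_distance(true_rank, predicted)
fr = spearman_footrule(true_rank, predicted)

# Illustrative values only: n = 1000 items, complement chromatic number 2.
budget = pair_budget(1000, 2)
all_pairs_scale = 1000 ** 2
```

With these illustrative values, one swapped adjacent pair gives a Kendall's tau distance of 1 and a footrule distance of 2, and the evaluated expression is roughly 1.6 × 10^4, compared with 10^6 for the n^2 scale.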