High Dimensional Clustering with r-nets

  • Georgia Avarikioti ETH Zurich
  • Alain Ryser ETH Zurich
  • Yuyi Wang ETH Zurich
  • Roger Wattenhofer ETH Zurich

Abstract

Clustering, a fundamental task in data science and machine learning, groups a set of objects in such a way that objects in the same cluster are closer to each other than to those in other clusters. In this paper, we consider a well-known structure, so-called r-nets, which rigorously captures the properties of clustering. We devise algorithms that improve the runtime of approximating r-nets in high-dimensional spaces with1 and `2 metrics from, where . These algorithms are also used to improve a framework that provides approximate solutions to other high dimensional distance problems. Using this framework, several important related problems can also be solved efficiently, e.g.,pproximate kth-nearest neighbor distance-approximate Min-Max clustering,-approximate k-center clustering. In addition, we build an algorithm that-approximates greedy permutations in time O˜((dn+n2−α)·logΦ) where Φ is the spread of the input. This algorithm is used to -approximate k-center with the same time complexity.

Published
2019-07-17
Section
AAAI Technical Track: Machine Learning