Unsupervised Feature Selection by Pareto Optimization

  • Chao Feng, University of Science and Technology of China
  • Chao Qian, University of Science and Technology of China
  • Ke Tang, Southern University of Science and Technology

Abstract

Dimensionality reduction is often employed to handle data with a huge number of features. Its methods fall into two broad categories: feature transformation and feature selection. Owing to its interpretability, its efficiency during inference, and the abundance of unlabeled data, unsupervised feature selection has attracted much attention. In this paper, we consider its natural formulation, column subset selection (CSS), which seeks a subset of features (columns) that minimizes the reconstruction error of a data matrix. We propose POCSS, an anytime randomized iterative approach that minimizes the reconstruction error and the number of selected features simultaneously. Its approximation guarantee is well bounded. Empirical results show that POCSS outperforms state-of-the-art algorithms.
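To make the CSS objective concrete, the sketch below computes the two quantities the abstract says are minimized together: the reconstruction error of a data matrix given a chosen column subset, and the size of that subset. It assumes the common formulation in which the error is the squared Frobenius norm of the residual after projecting the matrix onto the span of the selected columns; the abstract does not spell out the norm, and the function names and toy matrix are hypothetical. This illustrates the objective only, not the POCSS algorithm itself.

```python
import numpy as np


def css_reconstruction_error(A, selected):
    """Squared Frobenius error after projecting A onto span(A[:, selected]).

    Assumption: error = ||A - S S^+ A||_F^2, where S holds the selected columns.
    """
    if len(selected) == 0:
        # With no columns selected, nothing is reconstructed.
        return float(np.linalg.norm(A, "fro") ** 2)
    S = A[:, list(selected)]
    # X = S^+ A is the least-squares solution of min_X ||A - S X||_F.
    X = np.linalg.pinv(S) @ A
    return float(np.linalg.norm(A - S @ X, "fro") ** 2)


def objectives(A, selected):
    """Return the pair (reconstruction error, number of selected features)."""
    return css_reconstruction_error(A, selected), len(selected)


# Toy data (hypothetical): the third column is the sum of the first two.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

print(objectives(A, [0, 1]))  # two columns span the whole matrix
print(objectives(A, [2]))     # one column leaves a nonzero residual
```

In this toy case the two subsets trade off against each other: selecting columns 0 and 1 reconstructs the matrix essentially exactly at a cost of two features, while selecting only column 2 uses one feature but leaves some residual error. Neither choice is better on both objectives, which is the kind of trade-off a bi-objective formulation exposes.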

Published: 2019-07-17
Section: AAAI Technical Track: Machine Learning