TAPAS: Train-Less Accuracy Predictor for Architecture Search

  • R. Istrate, IBM Research - Zurich
  • F. Scheidegger, IBM Research - Zurich
  • G. Mariani, IBM Research - Zurich
  • D. Nikolopoulos, Queens University of Belfast
  • C. Bekas, IBM Research
  • A. C. I. Malossi, IBM Research - Zurich

Abstract

In recent years, a growing number of researchers and practitioners have proposed algorithms for large-scale neural network architecture search: genetic algorithms, reinforcement learning, learning-curve extrapolation, and accuracy predictors. None of these, however, has demonstrated high performance on unseen datasets without training new experiments. We propose a new deep neural network accuracy predictor that estimates, in fractions of a second and without any training, the classification performance on unseen input datasets. In contrast to previously proposed approaches, our prediction is calibrated not only on topological network information but also on a characterization of the dataset difficulty, which allows us to re-tune the prediction without any training. Our predictor evaluates more than 100 networks per second on a single GPU, creating the opportunity to perform large-scale architecture search within a few minutes. We present the results of two searches performed in 400 seconds on a single GPU. Our best discovered networks reach 93.67% accuracy on CIFAR-10 and 81.01% on CIFAR-100, verified by training. These networks are competitive with other automatically discovered state-of-the-art networks, yet we required only a small fraction of the time to solution and the computational resources.
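To make the idea concrete, the following is a minimal, hypothetical sketch of a train-less accuracy predictor. It is not the paper's actual model: the topology encoding, the feature set, the scalar dataset-difficulty score, and the linear surrogate with hand-picked weights are all illustrative assumptions. It only shows why such a predictor can screen candidates at a rate of hundreds per second: scoring a network reduces to a cheap feature computation, with no training of the candidate.

```python
# Hypothetical sketch of a train-less accuracy predictor.
# All names and numbers (the (type, width) layer encoding, the feature
# set, the difficulty score, the weights) are illustrative assumptions,
# not the TAPAS implementation.

def topology_features(layers):
    """Summarize a network's topology as a fixed-length feature vector.
    `layers` is a list of (layer_type, width) tuples -- an assumed encoding."""
    depth = len(layers)
    total_width = sum(w for _, w in layers)
    n_conv = sum(1 for t, _ in layers if t == "conv")
    return [depth, total_width / max(depth, 1), n_conv]

def predict_accuracy(layers, dataset_difficulty, weights, bias):
    """Linear surrogate: combine topology features with a scalar
    dataset-difficulty score. No candidate network is trained --
    prediction is a single dot product over cheap features."""
    feats = topology_features(layers) + [dataset_difficulty]
    score = bias + sum(w * f for w, f in zip(weights, feats))
    return max(0.0, min(1.0, score))  # clamp to a valid accuracy range

# Screening many candidates is just a fast loop -- this is what makes
# rates above 100 networks per second plausible on a single device.
candidates = [
    [("conv", 64), ("conv", 128), ("fc", 10)],
    [("conv", 32), ("fc", 10)],
]
weights, bias = [0.01, 0.0005, 0.02, -0.3], 0.5
for net in candidates:
    print(predict_accuracy(net, dataset_difficulty=0.4,
                           weights=weights, bias=bias))
```

The key design point the paper highlights is the dataset-difficulty term: because the predictor is conditioned on a characterization of the target dataset, the same surrogate can be re-tuned for an unseen dataset without running any new training experiments.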

Published
2019-07-17
Section
AAAI Technical Track: Machine Learning