Residual Invertible Spatio-Temporal Network for Video Super-Resolution

  • Xiaobin Zhu Beijing Technology and Business University
  • Zhuangzi Li Beijing Technology and Business University
  • Xiao-Yu Zhang Chinese Academy of Sciences
  • Changsheng Li University of Electronic Science and Technology of China
  • Yaqi Liu Chinese Academy of Sciences
  • Ziyu Xue Academy of Broadcasting Science

Abstract

Video super-resolution is a challenging task that has attracted great attention in both the research and industry communities. In this paper, we propose a novel end-to-end architecture, called the Residual Invertible Spatio-Temporal Network (RISTN), for video super-resolution. RISTN sufficiently exploits spatial information when mapping low-resolution frames to high-resolution ones, and effectively models the temporal consistency across consecutive video frames. Compared with existing recurrent convolutional network based approaches, RISTN is much deeper yet more efficient. It consists of three major components: in the spatial component, a lightweight residual invertible block is designed to reduce information loss during feature transformation and to provide robust feature representations; in the temporal component, a novel recurrent convolutional model with residual dense connections is proposed to construct a deeper network and avoid feature degradation; in the reconstruction component, a new fusion method based on a sparse strategy is proposed to integrate the spatial and temporal features. Experiments on public benchmark datasets demonstrate that RISTN outperforms the state-of-the-art methods.
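The abstract does not spell out the residual invertible block's internals, but the "reduce information loss during feature transformation" property of invertible blocks can be illustrated with a generic additive coupling layer, a standard invertible construction: the input channels are split in half, and one half is shifted by a residual function of the other, so the input is exactly recoverable no matter what that function computes. This is a minimal sketch under that assumption; the function `F` and the weights `W1`, `W2` are illustrative toys, not the authors' actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weights for a small residual transform F applied to half the channels.
# Illustrative only -- not the paper's architecture.
W1 = rng.standard_normal((4, 4)) * 0.1
W2 = rng.standard_normal((4, 4)) * 0.1

def F(x):
    """A toy residual function: two linear maps with a ReLU in between."""
    return np.maximum(x @ W1, 0.0) @ W2

def coupling_forward(x):
    """Additive coupling: y1 = x1, y2 = x2 + F(x1).

    Invertible for *any* F, so the transform loses no information.
    """
    x1, x2 = np.split(x, 2, axis=-1)
    return np.concatenate([x1, x2 + F(x1)], axis=-1)

def coupling_inverse(y):
    """Exact inverse: x1 = y1, x2 = y2 - F(y1)."""
    y1, y2 = np.split(y, 2, axis=-1)
    return np.concatenate([y1, y2 - F(y1)], axis=-1)

x = rng.standard_normal((3, 8))   # 3 feature vectors with 8 channels each
y = coupling_forward(x)
x_rec = coupling_inverse(y)
print(np.allclose(x, x_rec))  # exact reconstruction
```

Because the inverse only ever re-evaluates `F` in the forward direction, `F` itself need not be invertible, which is what makes such blocks lightweight to design.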

Published: 2019-07-17
Section: AAAI Technical Track: Machine Learning