Robust Deep Co-Saliency Detection with Group Semantic

Authors

  • Chong Wang, Chinese Academy of Sciences
  • Zheng-Jun Zha, University of Science and Technology of China
  • Dong Liu, University of Science and Technology of China
  • Hongtao Xie, University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v33i01.33018917

Abstract

High-level semantic knowledge, in addition to low-level visual cues, is crucial for co-saliency detection. This paper proposes a novel end-to-end deep learning approach for robust co-saliency detection that simultaneously learns a high-level group-wise semantic representation and deep visual features of a given image group. Inter-image interaction at the semantic level, as well as the complementarity between group semantics and visual features, is exploited to boost the inference of co-salient regions. Specifically, the proposed approach consists of a co-category learning branch and a co-saliency detection branch. The former learns a group-wise semantic vector using the co-category association of an image group as supervision, while the latter infers precise co-saliency maps based on the ensemble of group semantic knowledge and deep visual cues. The group semantic vector is broadcast to each spatial location of multi-scale visual feature maps and serves as top-down semantic guidance for the bottom-up inference of co-saliency. The co-category learning and co-saliency detection branches are jointly optimized in a multi-task learning manner, further improving the robustness of the approach. Moreover, we construct a new large-scale co-saliency dataset, COCO-SEG, to facilitate research on co-saliency detection. Extensive experimental results on COCO-SEG and the widely used benchmark Cosal2015 demonstrate the superiority of the proposed approach over state-of-the-art methods.
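To make the two-branch design described above concrete, below is a minimal PyTorch sketch of the idea: a co-category branch predicts a group-wise semantic vector, which is broadcast over the spatial locations of visual feature maps to guide co-saliency prediction, and both branches are optimized jointly as multi-task learning. This is not the paper's implementation; the layer sizes, the mean-pooling group aggregation, the single-scale fusion, and the names `GroupSemanticCoSaliency`, `sem_fc`, `cat_fc`, and `fuse` are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupSemanticCoSaliency(nn.Module):
    """Illustrative two-branch model: a co-category branch produces a
    group-wise semantic vector; a co-saliency branch fuses that vector,
    broadcast spatially, with visual features to predict saliency maps.
    Layer sizes and aggregation are placeholders, not the paper's setup."""

    def __init__(self, num_categories=78, feat_dim=256, sem_dim=128):
        super().__init__()
        # Stand-in for a pretrained CNN backbone shared by both branches.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Co-category branch: image-level semantics aggregated over the group.
        self.sem_fc = nn.Linear(feat_dim, sem_dim)
        self.cat_fc = nn.Linear(sem_dim, num_categories)
        # Co-saliency branch: fuse broadcast semantics with visual features.
        self.fuse = nn.Conv2d(feat_dim + sem_dim, feat_dim, 3, padding=1)
        self.head = nn.Conv2d(feat_dim, 1, 1)

    def forward(self, images):
        # images: (N, 3, H, W), all N images drawn from one group.
        feats = self.backbone(images)                   # (N, C, h, w)
        pooled = feats.mean(dim=(2, 3))                 # (N, C)
        # Group-wise semantic vector: average the per-image semantics
        # (a simple aggregation chosen here for illustration).
        sem = self.sem_fc(pooled).mean(dim=0)           # (sem_dim,)
        cat_logits = self.cat_fc(sem)                   # (num_categories,)
        # Broadcast the semantic vector to every spatial location and use
        # it as top-down guidance for the bottom-up visual features.
        n, _, h, w = feats.shape
        sem_map = sem.view(1, -1, 1, 1).expand(n, -1, h, w)
        fused = F.relu(self.fuse(torch.cat([feats, sem_map], dim=1)))
        sal = torch.sigmoid(self.head(fused))           # (N, 1, h, w)
        return sal, cat_logits

if __name__ == "__main__":
    # Joint multi-task training step on one toy group of 5 images.
    model = GroupSemanticCoSaliency()
    group = torch.randn(5, 3, 128, 128)
    sal, cat_logits = model(group)
    masks = torch.rand(5, 1, 32, 32)                    # co-saliency ground truth
    labels = torch.zeros(78); labels[3] = 1.0           # group's co-category label
    loss = (F.binary_cross_entropy(sal, masks)
            + F.binary_cross_entropy_with_logits(cat_logits, labels))
    loss.backward()
```

Broadcasting the group vector to every spatial position, rather than attaching it once at the image level, lets the semantic guidance modulate each location of the feature maps, which is the role the abstract ascribes to the top-down signal.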

Published

2019-07-17

How to Cite

Wang, C., Zha, Z.-J., Liu, D., & Xie, H. (2019). Robust Deep Co-Saliency Detection with Group Semantic. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 8917-8924. https://doi.org/10.1609/aaai.v33i01.33018917

Issue

Vol. 33 No. 01 (2019)

Section

AAAI Technical Track: Vision