Deep Embedding Features for Salient Object Detection

Authors

  • Yunzhi Zhuge Dalian University of Technology
  • Yu Zeng Dalian University of Technology
  • Huchuan Lu Dalian University of Technology

DOI:

https://doi.org/10.1609/aaai.v33i01.33019340

Abstract

Benefiting from the rapid development of Convolutional Neural Networks (CNNs), some salient object detection methods have achieved remarkable results by utilizing multi-level convolutional features. However, saliency training datasets are of limited scale due to the high cost of pixel-level labeling, which limits the generalization of the trained model to new scenarios at test time. Besides, some FCN-based methods directly integrate multi-level features, ignoring the fact that the noise in some features is harmful to saliency detection. In this paper, we propose a novel approach that transforms prior information into an embedding space to select attentive features and filter out outliers for salient object detection. Our network first generates a coarse prediction map through an encoder-decoder structure. Then a Feature Embedding Network (FEN) is trained to embed each pixel of the coarse map into a metric space, which incorporates attentive features that highlight salient regions and suppress the response of non-salient regions. Further, the embedded features are refined through a deep-to-shallow Recursive Feature Integration Network (RFIN) to improve the details of the prediction maps. Moreover, to alleviate blurred boundaries, we propose a Guided Filter Refinement Network (GFRN) that jointly optimizes the predicted results and the learnable guidance maps. Extensive experiments on five benchmark datasets demonstrate that our method outperforms state-of-the-art methods. Our proposed method is end-to-end and achieves a real-time speed of 38 FPS.
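
The abstract describes a four-stage pipeline: a coarse encoder-decoder prediction, a Feature Embedding Network, a deep-to-shallow Recursive Feature Integration Network, and a Guided Filter Refinement Network. The following is a minimal, hypothetical PyTorch sketch of such a pipeline; the module names, channel sizes, and layer choices are illustrative assumptions and are not the authors' implementation.

```python
# Hypothetical sketch of the four stages described in the abstract (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


def box_mean(x, r):
    """Mean over a (2r+1) x (2r+1) window, as used by the guided filter."""
    return F.avg_pool2d(x, kernel_size=2 * r + 1, stride=1, padding=r)


class GuidedFilterRefinement(nn.Module):
    """Differentiable guided filter: a learnable guidance map smooths the
    prediction while preserving edges (standard guided-filter equations)."""
    def __init__(self, in_ch, r=4, eps=1e-2):
        super().__init__()
        self.guide = nn.Conv2d(in_ch, 1, 3, padding=1)  # learnable guidance map
        self.r, self.eps = r, eps

    def forward(self, feat, pred):
        I = torch.sigmoid(self.guide(feat))              # guidance image
        p = pred                                         # map to refine
        mean_I, mean_p = box_mean(I, self.r), box_mean(p, self.r)
        cov_Ip = box_mean(I * p, self.r) - mean_I * mean_p
        var_I = box_mean(I * I, self.r) - mean_I * mean_I
        a = cov_Ip / (var_I + self.eps)
        b = mean_p - a * mean_I
        return box_mean(a, self.r) * I + box_mean(b, self.r)


class DEFNet(nn.Module):
    """Toy end-to-end pipeline mirroring the four stages in the abstract."""
    def __init__(self, ch=32, embed_dim=16):
        super().__init__()
        # 1) coarse prediction via a small encoder-decoder
        self.enc1 = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(True))
        self.enc2 = nn.Sequential(nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1), nn.ReLU(True))
        self.dec = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(True))
        self.coarse_pred = nn.Conv2d(ch, 1, 1)
        # 2) feature embedding (FEN): per-pixel embedding into a metric space
        self.embed = nn.Conv2d(ch + 1, embed_dim, 1)
        # 3) recursive feature integration (RFIN): deep-to-shallow fusion
        self.fuse = nn.Sequential(nn.Conv2d(embed_dim + ch, ch, 3, padding=1), nn.ReLU(True))
        self.refined_pred = nn.Conv2d(ch, 1, 1)
        # 4) guided filter refinement (GFRN)
        self.gfrn = GuidedFilterRefinement(ch)

    def forward(self, x):
        # stage 1: coarse saliency map from the encoder-decoder
        f_shallow = self.enc1(x)
        f_deep = self.enc2(f_shallow)
        d = F.interpolate(self.dec(f_deep), size=f_shallow.shape[2:],
                          mode='bilinear', align_corners=False)
        coarse = torch.sigmoid(self.coarse_pred(d))
        # stage 2: embed each pixel of the coarse map together with deep features
        emb = F.normalize(self.embed(torch.cat([d, coarse], dim=1)), dim=1)
        # stage 3: integrate embedded (deep) features with shallow features
        fused = self.fuse(torch.cat([emb, f_shallow], dim=1))
        refined = torch.sigmoid(self.refined_pred(fused))
        # stage 4: guided filtering with a learnable guidance map
        final = self.gfrn(fused, refined).clamp(0, 1)
        return coarse, refined, final


if __name__ == "__main__":
    net = DEFNet()
    coarse, refined, final = net(torch.randn(1, 3, 224, 224))
    print(coarse.shape, refined.shape, final.shape)  # each is (1, 1, 224, 224)
```

The guided-filter stage is the part most directly tied to the stated goal of sharper boundaries: the learnable guidance map steers the filter so the refined saliency map follows image edges rather than the blurred coarse prediction.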

Published

2019-07-17

How to Cite

Zhuge, Y., Zeng, Y., & Lu, H. (2019). Deep Embedding Features for Salient Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 9340-9347. https://doi.org/10.1609/aaai.v33i01.33019340

Section

AAAI Technical Track: Vision