Human-Like Delicate Region Erasing Strategy for Weakly Supervised Detection
We explore a principle method to address the weakly supervised detection problem. Many deep learning methods solve weakly supervised detection by mining various object proposal or pooling strategies, which may cause redundancy and generate a coarse location. To overcome this limitation, we propose a novel human-like active searching strategy that recurrently ignores the background and discovers class-specific objects by erasing undesired pixels from the image. The proposed detector acts as an agent, providing guidance to erase unremarkable regions and eventually concentrating the attention on the foreground. The proposed agents, which are composed of a deep Q-network and are trained by the Q-learning algorithm, analyze the contents of the image features to infer the localization action according to the learned policy. To the best of our knowledge, this is the first attempt to apply reinforcement learning to address weakly supervised localization with only image-level labels. Consequently, the proposed method is validated on the PASCAL VOC 2007 and PASCAL VOC 2012 datasets. The experimental results show that the proposed method is capable of locating a single object within 5 steps and has great significance to the research on weakly supervised localization with a human-like mechanism.