Meta Learning for Image Captioning

Nannan Li; Zhenzhong Chen; Shan Liu

doi:10.1609/aaai.v33i01.33018626

Authors

Nannan Li, Wuhan University
Zhenzhong Chen, Wuhan University
Shan Liu, Tencent America

Abstract

Reinforcement learning (RL) has shown its advantages in image captioning by directly optimizing non-differentiable evaluation metrics as the reward. However, because of the reward hacking problem in RL, maximizing the reward may not lead to better captions, especially with respect to propositional content and distinctiveness. In this work, we propose a new learning method, meta learning, that uses supervision from the ground truth while optimizing the reward function in RL. To improve the propositional content and the distinctiveness of the generated captions, the proposed model provides the globally optimal solution by taking different gradient steps towards the supervision task and the reinforcement task simultaneously. Experimental results on MS COCO validate the effectiveness of our approach in comparison with state-of-the-art methods.
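
The abstract does not spell out the update rule, so the following is only a minimal sketch of what taking separate gradient steps towards a supervision task and a reinforcement task could look like, assuming a MAML-style inner and outer loop. Everything in it is hypothetical: the two tasks are stood in for by toy quadratic losses, and the names `meta_step`, `alpha_sup`, `alpha_rl`, and `beta` are illustrative rather than taken from the paper.

```python
def grad_quadratic(theta, target):
    """Gradient of the toy loss 0.5 * ||theta - target||^2 w.r.t. theta."""
    return [t - c for t, c in zip(theta, target)]


def meta_step(theta, sup_target, rl_target,
              alpha_sup=0.1, alpha_rl=0.1, beta=0.5):
    """One MAML-style meta update over two toy tasks (an assumption,
    not the paper's stated algorithm).

    sup_target stands in for the supervised (ground-truth) objective,
    rl_target for the reward objective. Each task gets its own inner
    step size; beta is the outer (meta) step size.
    """
    meta_grad = [0.0] * len(theta)
    for target, alpha in ((sup_target, alpha_sup), (rl_target, alpha_rl)):
        # Inner step: adapt a copy of theta towards this task only.
        g = grad_quadratic(theta, target)
        adapted = [t - alpha * gi for t, gi in zip(theta, g)]
        # Outer gradient: task loss at the adapted parameters,
        # differentiated w.r.t. the shared theta. For this quadratic
        # the Jacobian d(adapted)/d(theta) is (1 - alpha) * I.
        g_adapted = grad_quadratic(adapted, target)
        meta_grad = [m + (1.0 - alpha) * ga
                     for m, ga in zip(meta_grad, g_adapted)]
    # Meta update: move the shared parameters using both tasks at once.
    return [t - beta * m for t, m in zip(theta, meta_grad)]


def train(theta, sup_target, rl_target, steps=200):
    """Repeat the meta update for a fixed number of steps."""
    for _ in range(steps):
        theta = meta_step(theta, sup_target, rl_target)
    return theta


if __name__ == "__main__":
    print(train([0.0, 0.0], sup_target=[1.0, 0.0], rl_target=[0.0, 1.0]))
```

In this toy setting with equal inner step sizes, the shared parameters settle midway between the two task optima instead of collapsing onto either one, which is the intuition behind balancing ground-truth supervision against reward maximization. A real captioning model would replace the quadratic losses with a cross-entropy loss and a policy-gradient loss over caption tokens.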
