Regularizing Neural Machine Translation by Target-Bidirectional Agreement

  • Zhirui Zhang University of Science and Technology of China
  • Shuangzhi Wu Harbin Institute of Technology
  • Shujie Liu Microsoft Research Asia
  • Mu Li Microsoft Research Asia
  • Ming Zhou Microsoft Research
  • Tong Xu University of Science and Technology of China

Abstract

Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming shared with other sequence generation tasks: errors made early in the generation process are fed back to the model as inputs and can be quickly amplified, harming the rest of the generated sequence. To address this issue, we propose a novel model regularization method for NMT training that aims to improve the agreement between translations generated by left-to-right (L2R) and right-to-left (R2L) NMT decoders. This goal is achieved by introducing two Kullback-Leibler divergence regularization terms into the NMT training objective to reduce the mismatch between the output probabilities of the L2R and R2L models. In addition, we employ a joint training strategy that allows the L2R and R2L models to improve each other through an interactive update process. Experimental results show that our proposed method significantly outperforms state-of-the-art baselines on Chinese-English and English-German translation tasks.
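
As a rough illustration of the agreement idea described in the abstract, the following minimal sketch adds two directed Kullback-Leibler terms to a likelihood loss. It is not the authors' implementation: the function names, the `weight` parameter, the toy probabilities, and the choice to compare distributions over a small fixed set of candidate translations are all assumptions made for this example, since the abstract does not specify how the terms are computed or weighted.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same candidates."""
    # eps guards against log(0); terms with p_i == 0 contribute nothing.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def agreement_penalty(p_l2r, p_r2l):
    """Sum of both directed KL terms between the L2R and R2L distributions."""
    return kl_divergence(p_l2r, p_r2l) + kl_divergence(p_r2l, p_l2r)


def regularized_loss(nll, p_l2r, p_r2l, weight=1.0):
    """Likelihood loss plus the weighted penalty (weight is hypothetical)."""
    return nll + weight * agreement_penalty(p_l2r, p_r2l)


# Toy probabilities that each decoder assigns to three candidate translations
# (illustrative values only, not results from the paper).
p_l2r = [0.7, 0.2, 0.1]
p_r2l = [0.4, 0.4, 0.2]

print(agreement_penalty(p_l2r, p_r2l))
print(regularized_loss(2.3, p_l2r, p_r2l, weight=0.5))
```

In this sketch the penalty is zero when the two decoders assign identical probabilities and grows as they disagree, so minimizing the combined loss pushes the two models toward agreement.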

Published
2019-07-17
Section
AAAI Technical Track: AI and the Web