Paraphrase Diversification Using Counterfactual Debiasing

Sunghyun Park; Seung-won Hwang; Fuxiang Chen; Jaegul Choo; Jung-Woo Ha; Sunghun Kim; Jinyeong Yim

doi:10.1609/aaai.v33i01.33016883

Authors

Sunghyun Park, Yonsei University
Seung-won Hwang, Yonsei University
Fuxiang Chen, NAVER Corp.
Jaegul Choo, Korea University
Jung-Woo Ha, NAVER Corp.
Sunghun Kim, Hong Kong University of Science and Technology
Jinyeong Yim, NAVER Corp.

DOI:

https://doi.org/10.1609/aaai.v33i01.33016883

Abstract

We examine the problem of generating a set of diverse paraphrase sentences while (1) not compromising the meaning of the original sentence, and (2) imposing diversity in various semantic aspects, such as lexical or syntactic structure. Existing work on paraphrase generation has focused more on the former, while the latter has been trained only as a fixed style transfer, such as transferring from positive to negative sentiment, even at the cost of losing semantics. In this work, we consider style transfer as a means of imposing diversity, under a paraphrasing correctness constraint: the target sentence must remain a paraphrase of the original sentence. Our goal is to maximize the diversity of a set of k generated paraphrases, which we denote as the diversified paraphrase (DP) problem. Our key contribution is deciding the style guidance at generation time so that each output increases diversity with respect to the paraphrases generated previously. As pre-materializing training data for all style decisions is impractical, we train with biased data, but with debiasing guidance. Compared to state-of-the-art methods, our proposed model generates more diverse yet semantically consistent paraphrase sentences. Specifically, our model, trained on the MSCOCO dataset, achieves the highest embedding scores (0.94/0.95/0.86), similar to state-of-the-art results, with an mBLEU score that is 8.73% lower (i.e., more diverse).
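
To make the diversity criterion concrete, the following is a minimal sketch of a mutual-overlap score over a set of k paraphrases, assuming that mBLEU refers to an overlap score averaged over all ordered pairs of generated sentences (lower means more diverse). It is not the authors' evaluation code: the helper names `ngrams`, `overlap`, and `mutual_overlap` are hypothetical, and the simple clipped n-gram precision used here is a stand-in for full BLEU (no brevity penalty, no geometric mean).

```python
from collections import Counter
from itertools import permutations


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(candidate, reference, max_n=2):
    """Mean clipped n-gram precision of candidate against reference.

    Simplified stand-in for BLEU (an assumption of this sketch).
    """
    cand, ref = candidate.lower().split(), reference.lower().split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        if total == 0:
            continue  # candidate too short to contain any n-gram of this order
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(matched / total)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mutual_overlap(paraphrases, max_n=2):
    """Average pairwise overlap within a set; lower means more diverse."""
    pairs = list(permutations(paraphrases, 2))
    if not pairs:
        return 0.0
    return sum(overlap(a, b, max_n) for a, b in pairs) / len(pairs)


near_duplicates = ["a man rides a horse", "a man rides a horse on a beach"]
diverse = ["a man rides a horse", "someone is on horseback by the sea"]
print(mutual_overlap(near_duplicates) > mutual_overlap(diverse))
```

Under this sketch, a set of near-identical paraphrases scores close to 1 and a set with little shared wording scores close to 0, which is the direction in which a lower mBLEU indicates more diverse output.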
