Deep generative modeling of natural language has achieved many successes, such as producing fluent sentences and translating from one language into another. However, generative modeling techniques for paraphrase generation still lag behind, largely because of the difficulty of reconciling expression diversity with semantic preservation. This paper proposes to generate diverse, high-quality paraphrases by exploiting pre-trained models with instance-dependent prompts. To learn generalizable prompts, we assume that the number of abstract transformation patterns in paraphrase generation (governed by prompts) is finite and usually not large. We therefore present vector-quantized prompts as cues to control the generation of pre-trained models. Extensive experiments demonstrate that the proposed method achieves new state-of-the-art results on three benchmark datasets: Quora, Wikianswers, and MSCOCO. We will release all the code upon acceptance.
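The abstract does not say how the prompts are quantized, so the following is only a minimal sketch of the general idea behind vector quantization: each continuous, instance-dependent prompt vector is snapped to its nearest entry in a small, finite codebook. The function names, the squared Euclidean distance, and the toy 2-dimensional values are all assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize_prompts(
    prompts: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each continuous prompt vector to its nearest codebook entry.

    Hypothetical helper: returns the chosen codebook indices and the
    quantized vectors that would replace the continuous prompts.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for prompt in prompts:
        # Nearest-neighbor lookup over the finite set of codes.
        idx = min(
            range(len(codebook)),
            key=lambda k: squared_distance(prompt, codebook[k]),
        )
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return indices, quantized


# Toy codebook with three "transformation pattern" codes (assumed values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
# Toy instance-dependent prompt vectors, e.g. produced by an encoder.
prompts = [[0.9, 1.2], [0.1, -0.2], [-0.8, 1.7]]

indices, quantized = quantize_prompts(prompts, codebook)
print(indices)  # [1, 0, 2]
```

Restricting prompts to a small codebook mirrors the abstract's assumption that the number of abstract transformation patterns is finite: many different input sentences end up sharing the same discrete prompt code.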