Contrastive learning has achieved impressive success in generation tasks, as it mitigates the "exposure bias" problem and discriminatively exploits references of differing quality. Existing works mostly apply contrastive learning at the instance level without discriminating the contribution of each word, yet keywords are the gist of the text and dominate the constrained mapping relationships. Hence, in this work, we propose a hierarchical contrastive learning mechanism that unifies hybrid-granularity semantic meaning in the input text. Concretely, we first build a keyword graph via the contrastive correlations of positive-negative pairs to iteratively polish the keyword representations. Then, we construct intra-contrasts at the instance level and the keyword level, where we assume words are nodes sampled from a sentence distribution. Finally, to bridge the gap between the independent contrast levels and tackle the common contrast-vanishing problem, we propose an inter-contrast mechanism that measures the discrepancy between each contrastive keyword node and the instance distribution. Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
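To make the instance-level intra-contrast concrete, below is a minimal sketch of a standard InfoNCE-style contrastive loss over one anchor, one positive reference, and several negative references. The function name `info_nce_loss`, the temperature value, and the tensor shapes are illustrative assumptions, not the paper's implementation; the keyword-graph polishing and inter-contrast components described above are not reproduced here.

```python
# A minimal InfoNCE-style contrastive loss sketch (illustrative only).
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Contrast one anchor against one positive and a set of negatives.

    anchor:    (d,)   embedding of the instance
    positive:  (d,)   embedding of a high-quality reference
    negatives: (n, d) embeddings of lower-quality references
    """
    # Cosine similarity via L2-normalized dot products.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_sim = (anchor @ positive) / temperature          # scalar
    neg_sim = (negatives @ anchor) / temperature         # (n,)
    logits = torch.cat([pos_sim.unsqueeze(0), neg_sim])  # (n + 1,)
    # The positive pair sits at index 0, so the target label is 0.
    return F.cross_entropy(logits.unsqueeze(0),
                           torch.zeros(1, dtype=torch.long))

# Example usage with embedding dimension 8 and three sampled negatives.
loss = info_nce_loss(torch.randn(8), torch.randn(8), torch.randn(3, 8))
```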