In this paper, we propose DimonGen, a task that aims to generate diverse sentences describing concept relationships in various everyday scenarios. To support the task, we first create a benchmark dataset by adapting the existing CommonGen dataset. We then propose a two-stage model called MoREE to generate the target sentences. MoREE consists of a mixture-of-retrievers model, which retrieves diverse context sentences related to the given concepts, and a mixture-of-generators model, which generates diverse sentences based on the retrieved contexts. We conduct experiments on the DimonGen task and show that MoREE outperforms strong baselines in both the quality and the diversity of the generated sentences. Our results demonstrate that MoREE can generate diverse sentences that reflect different relationships between concepts, leading to a comprehensive understanding of concept relationships.