We introduce AmbigNLG, a novel task designed to tackle the challenge of task ambiguity in instructions for Natural Language Generation (NLG). Ambiguous instructions often impede the performance of Large Language Models (LLMs), especially in complex NLG tasks. To address this issue, we propose an ambiguity taxonomy that categorizes different types of instruction ambiguity, and we use it to refine initial instructions with clearer specifications. Accompanying this task, we present AmbigSNI-NLG, a dataset of 2,500 annotated instances designed to facilitate research on AmbigNLG. Through comprehensive experiments with state-of-the-art LLMs, we demonstrate that our method significantly improves the alignment of generated text with user expectations, achieving gains of up to 15.02 ROUGE points. Our findings highlight the critical importance of addressing task ambiguity to fully harness the capabilities of LLMs in NLG tasks. Furthermore, we confirm the effectiveness of our method in practical settings involving interactive ambiguity mitigation with users, underscoring the benefits of leveraging LLMs for interactive clarification.