Advances in natural language generation (NLG) and large language models (LLMs) have led to proficient text generation across various tasks. However, because LLMs are opaque, integrating intricate constraints into neural text generation remains challenging. This study investigates constrained text generation for LLMs, in which predefined constraints are applied during the LLM's generation process. We examine multiple LLMs, including ChatGPT and GPT-4, and categorize constraints into lexical, structural, and relation-based types. We also present several benchmarks to facilitate fair evaluation. The study addresses key research questions, including the extent to which LLMs comply with constraints. The results reveal LLMs' strengths and weaknesses in incorporating constraints and provide insights for future work on constrained text generation. Code and datasets will be released upon acceptance.