User alignment is crucial for adapting general-purpose language models (LMs) to downstream tasks, but human annotations are often unavailable for all types of instructions, especially those with customized constraints. We observe that user instructions typically contain constraints. While assessing response quality with respect to the whole instruction is often costly, efficiently evaluating the satisfaction of constraints is feasible. We investigate common constraints in NLP tasks, categorize them into three classes based on the types of their arguments, and propose a unified framework, ACT (Aligning to ConsTraints), to automatically produce supervision signals for user alignment with constraints. Specifically, ACT uses constraint verifiers, which are typically easy to implement in practice, to compute the constraint satisfaction rate (CSR) of each response. It samples multiple responses for each prompt and automatically collects preference labels based on their CSRs. ACT then adapts the LM to the target task through a ranking-based learning process. Experiments on fine-grained entity typing, abstractive summarization, and temporal question answering show that ACT enhances LMs' capability to adhere to different classes of constraints, thereby improving task performance. Further experiments show that the constraint-following capabilities are transferable.
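The automatic supervision pipeline described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the two verifier functions and the sample responses are hypothetical placeholders for whatever task-specific constraints and model samples ACT would actually use.

```python
# Hypothetical constraint verifiers; real verifiers are task-specific
# (e.g., checking that predicted entity types belong to a given taxonomy).

def length_verifier(response: str, max_words: int = 20) -> bool:
    """Constraint: the response must contain at most max_words words."""
    return len(response.split()) <= max_words

def keyword_verifier(response: str, keyword: str = "Paris") -> bool:
    """Constraint: the response must mention a required keyword."""
    return keyword in response

def csr(response, verifiers):
    """Constraint satisfaction rate: fraction of verifiers the response passes."""
    return sum(v(response) for v in verifiers) / len(verifiers)

def preference_pairs(responses, verifiers):
    """Rank sampled responses by CSR and emit (preferred, dispreferred) pairs
    whenever one response strictly satisfies more constraints than another."""
    ranked = sorted(responses, key=lambda r: csr(r, verifiers), reverse=True)
    return [(a, b)
            for i, a in enumerate(ranked)
            for b in ranked[i + 1:]
            if csr(a, verifiers) > csr(b, verifiers)]

# Toy sampled responses for one prompt (illustrative only).
responses = [
    "The capital of France is Paris.",                       # passes both
    "The capital of France is Paris, " + "a lovely city " * 10,  # too long
    "I am not sure about the answer.",                       # missing keyword
]
verifiers = [length_verifier, keyword_verifier]
pairs = preference_pairs(responses, verifiers)
```

The resulting preference pairs would then feed a ranking-based learning objective (e.g., a pairwise preference loss) to adapt the LM toward higher constraint satisfaction.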