The recently introduced Controlled Text Reduction (CTR) task isolates the text generation step within typical summarization-style tasks. It does so by challenging models to generate coherent text conforming to pre-selected content within the input text (``highlights''). This framing increases modularity in summarization-like tasks, allowing a single CTR model to be coupled with various content-selection setups and modules. However, there are currently no reliable CTR models, and the performance of the existing baseline for the task is mediocre, falling short of practical utility. Here, we address this gap by introducing a high-quality, open-source CTR model that tackles two key limitations of prior work: inadequate enforcement of the content-preservation constraint, and suboptimal silver training data. To address the first, we amplify the content-preservation constraint both in training, via reinforcement learning (RL), and at inference, via a controlled decoding strategy. To address the second, we substantially improve the quality of the silver training data via GPT-4 distillation. Overall, pairing the distilled dataset with the highlight-adherence strategies yields marked gains over the current baseline, of up to 30 ROUGE-L points, providing a reliable CTR model for downstream use.