Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability

Language model post-training has enhanced instruction-following and performance on many downstream tasks, but it also comes with an often-overlooked cost on tasks with many possible valid answers. On tasks such as creative writing, synthetic data generation, or steering to diverse preferences, models must cover an entire distribution of outputs rather than produce a single correct answer. We characterize three desiderata for conditional distributional modeling (in-context steerability, valid output space coverage, and distributional alignment) and document across three model families how current post-training can reduce these properties. In particular, we distinguish between two kinds of in-context learning: ICL for eliciting existing underlying knowledge or capabilities, and in-context steerability, where a model must use in-context information to override its priors and steer to a novel data-generating distribution. To better evaluate and improve these desiderata, we introduce Spectrum Suite, a large-scale resource compiled from more than 40 data sources and spanning more than 90 tasks that require models to steer to and match diverse distributions, ranging from varied human preferences to numerical distributions. We find that while current post-training techniques elicit underlying capabilities and knowledge, they hurt models' ability to flexibly steer in-context. To mitigate these issues, we propose Spectrum Tuning, a post-training method that uses Spectrum Suite to improve steerability and distributional coverage. We find that Spectrum Tuning often improves over pretrained and typical instruction-tuned models, enhancing steerability, spanning more of the output space, and improving distributional alignment on held-out datasets.
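To make two of the desiderata concrete, the following is a minimal sketch of how output space coverage and distributional alignment could be measured on a toy task. It is an illustration under stated assumptions, not the metrics used in this work: the functions `coverage` and `total_variation`, the die-rolling task, and both sample lists are hypothetical, and total variation distance is only one of several reasonable alignment measures.

```python
from collections import Counter


def empirical_distribution(samples):
    """Turn a list of sampled outputs into a probability per distinct output."""
    counts = Counter(samples)
    total = len(samples)
    return {output: n / total for output, n in counts.items()}


def coverage(samples, valid_outputs):
    """Fraction of the valid outputs that appear at least once in the samples."""
    valid = set(valid_outputs)
    return len(valid & set(samples)) / len(valid)


def total_variation(p, q):
    """Total variation distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Hypothetical task: "roll a fair six-sided die"; the target is uniform over 1..6.
target = {str(face): 1 / 6 for face in range(1, 7)}

# Made-up samples standing in for model outputs (not real experimental data).
collapsed_samples = ["4"] * 9 + ["3"] * 3          # concentrates on two answers
spread_samples = ["1", "2", "3", "4", "5", "6"] * 2  # covers every valid answer

for name, samples in [("collapsed", collapsed_samples), ("spread", spread_samples)]:
    model_dist = empirical_distribution(samples)
    print(
        name,
        "coverage:", round(coverage(samples, target), 3),
        "TV distance:", round(total_variation(model_dist, target), 3),
    )
```

In this toy setup, a model that concentrates its samples on a few answers scores low on coverage and high on total variation distance from the target, while a model whose samples spread across all valid answers does well on both.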
