Frontier language models are deployed as black-box services, where model weights cannot be modified and customization is limited to prompting. We introduce Advisor Models, a method for training small open-weight models to generate dynamic, per-instance natural language advice that improves the capabilities of black-box frontier models. Advisor Models improve GPT-5's performance on RuleArena (Taxes) by 71%, reduce the number of steps Gemini 3 Pro takes on SWE agent tasks by 24.6%, and outperform static prompt optimizers at personalizing GPT-5 to user preferences (85-100% vs. 40-60%). We also find that advisors are transferable: an advisor trained with a low-cost student model still transfers its improvements to a frontier model. Moreover, Advisor Models are robust: we observe no degradation on benchmarks other than the one the pipeline is trained on. Our method shows how to perform parametric optimization of black-box frontier models in a practical and cost-effective way.