Multi-agent decision pipelines can outperform single-agent workflows when complementarity holds, i.e., when different agents contribute unique information that informs the final decision. We propose ComplLLM, a decision-theoretic post-training framework that fine-tunes a decision-assistant LLM, using complementary information as the reward, to output signals that complement existing agent decisions. We validate ComplLLM on synthetic and real-world tasks involving domain experts, demonstrating that the approach recovers known complementary information and produces plausible explanations of complementary signals to support downstream decision-makers.
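To make the notion of a complementarity-based reward concrete, here is a minimal hypothetical sketch. It assumes the reward is the gain in ensemble decision accuracy from adding the assistant's signal to the existing agents' votes; the abstract does not specify the exact reward definition, and all function names here are illustrative, not from the paper.

```python
# Hypothetical complementarity reward: ensemble accuracy with the assistant's
# signal minus ensemble accuracy without it. Illustrative only; the paper's
# actual reward formulation may differ.
from typing import List

def majority_vote(predictions: List[int]) -> int:
    # Binary majority vote; ties break toward 0 for determinism.
    return 1 if sum(predictions) * 2 > len(predictions) else 0

def complementarity_reward(
    agent_preds: List[List[int]],   # each inner list: one agent's predictions
    assistant_preds: List[int],     # the decision-assistant's signals
    labels: List[int],              # ground-truth decisions
) -> float:
    """Return the accuracy gain from augmenting the agent ensemble."""
    n = len(labels)
    base = sum(
        majority_vote([a[i] for a in agent_preds]) == labels[i]
        for i in range(n)
    ) / n
    augmented = sum(
        majority_vote([a[i] for a in agent_preds] + [assistant_preds[i]])
        == labels[i]
        for i in range(n)
    ) / n
    return augmented - base
```

Under this sketch, the reward is positive only when the assistant's signal corrects decisions the existing agents would otherwise get wrong, which matches the intuition that the assistant is trained to contribute unique, rather than redundant, information.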