Graph-based Multi-Agent Systems (MAS) enable complex cyclic workflows but suffer from inefficient static model allocation, where deploying strong models uniformly wastes computation on trivial sub-tasks. We propose CASTER (Context-Aware Strategy for Task Efficient Routing), a lightweight router for dynamic model selection in graph-based MAS. CASTER employs a Dual-Signal Router that combines semantic embeddings with structural meta-features to estimate task difficulty. During training, the router self-optimizes through a Cold Start to Iterative Evolution paradigm, learning from its own routing failures via on-policy negative feedback. Experiments using LLM-as-a-Judge evaluation across Software Engineering, Data Analysis, Scientific Discovery, and Cybersecurity demonstrate that CASTER reduces inference cost by up to 72.4% compared to strong-model baselines while matching their success rates, and consistently outperforms both heuristic routing and FrugalGPT across all domains.
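To make the routing idea concrete, here is a minimal sketch of a dual-signal difficulty estimator with a failure-driven update. It is not the authors' implementation: the abstract does not specify the router architecture, so the linear logistic scorer, the `Task` fields, the example meta-features (graph depth, retry count), the 0.5 threshold, and the learning rate are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Task:
    # Semantic signal: stand-in for a real sentence embedding (hypothetical values).
    embedding: Sequence[float]
    # Structural signal: hypothetical meta-features, e.g. [graph depth, retry count].
    meta: Sequence[float]


def difficulty(task: Task, w_sem: Sequence[float], w_meta: Sequence[float],
               bias: float = 0.0) -> float:
    """Combine both signals in one linear score and squash it to (0, 1)."""
    z = bias
    z += sum(w * x for w, x in zip(w_sem, task.embedding))
    z += sum(w * x for w, x in zip(w_meta, task.meta))
    return 1.0 / (1.0 + math.exp(-z))


def route(task: Task, w_sem: Sequence[float], w_meta: Sequence[float],
          bias: float = 0.0, threshold: float = 0.5) -> str:
    """Send tasks estimated as hard to the strong model, the rest to the weak one."""
    return "strong" if difficulty(task, w_sem, w_meta, bias) >= threshold else "weak"


def update_on_failure(task: Task, w_sem: Sequence[float], w_meta: Sequence[float],
                      bias: float, lr: float = 0.5
                      ) -> Tuple[List[float], List[float], float]:
    """One logistic-regression gradient step with label 1 ("hard").

    Assumption: a task the router sent to the weak model, which then failed,
    is treated as a negative-feedback example of an underestimated difficulty.
    """
    p = difficulty(task, w_sem, w_meta, bias)
    g = 1.0 - p  # gradient of the log-likelihood for label 1 w.r.t. the score z
    new_w_sem = [w + lr * g * x for w, x in zip(w_sem, task.embedding)]
    new_w_meta = [w + lr * g * x for w, x in zip(w_meta, task.meta)]
    return new_w_sem, new_w_meta, bias + lr * g


# Usage: a task first routed to the weak model, then corrected after a failure.
task = Task(embedding=[0.2, -0.1], meta=[3.0, 1.0])
w_sem, w_meta, bias = [0.0, 0.0], [0.0, 0.0], -1.0
before = route(task, w_sem, w_meta, bias)
w_sem, w_meta, bias = update_on_failure(task, w_sem, w_meta, bias)
after = route(task, w_sem, w_meta, bias)
```

In this sketch the router starts out biased toward the cheaper model and only shifts toward the strong model for tasks resembling its own past failures, which mirrors the cost-saving motivation stated in the abstract. A real router would presumably use a learned embedding model and batch updates rather than a single-example step.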