As multi-agent LLM pipelines grow in complexity, existing serving paradigms fail to adapt to dynamic serving conditions. We argue that agentic serving systems should be programmable and system-aware, unlike existing serving systems, which statically encode their parameters. In this work, we propose a new SDN-inspired agentic serving framework that helps control key attributes of communication based on runtime state. This architecture enables agent systems that are responsive and efficient to serve, and it paves the way for high-level, intent-driven agentic serving.