Autopoiesis: A Self-Evolving System Paradigm for LLM Serving Under Runtime Dynamics

Modern Large Language Model (LLM) serving operates in highly volatile environments characterized by severe runtime dynamics, such as workload fluctuations and elastic cluster autoscaling. Traditional serving systems rely on static, human-engineered serving policies (e.g., scheduling algorithms and rescheduling strategies) to manage these dynamics. However, these policies must navigate deeply intertwined runtime trade-offs (e.g., scheduling overhead vs. execution efficiency, rescheduling frequency vs. reconfiguration overhead). The optimal balance among these trade-offs is workload-specific and shifts continuously as runtime conditions evolve, so any fixed policy is fundamentally unable to adapt. We propose Autopoiesis, a novel online self-evolving system that shifts LLM serving from static policy deployment to continuous online policy evolution. First, Autopoiesis introduces an LLM-driven program synthesis workflow that evolves serving policies against dynamics observed in real time, so that the evolved policies reflect the optimal decisions for navigating the complex, multi-dimensional trade-off space. Second, Autopoiesis runs this synthesis process continuously during serving: it observes real-world system behavior and rewrites the policy code as runtime trade-offs shift. This transforms policy design from a one-time offline endeavor into an ongoing system component and enables autonomous adaptation to evolving runtime conditions. Together, these contributions establish a new paradigm: serving policies are no longer static artifacts designed by humans before deployment, but living code that LLMs continuously evolve throughout deployment to navigate runtime trade-offs beyond human design. We evaluate Autopoiesis across diverse runtime dynamics and show improvements of up to 53%, and 34% on average, over state-of-the-art LLM serving systems.
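
The observe, synthesize, and rewrite cycle described above can be illustrated with a minimal Python sketch. This is not the Autopoiesis implementation: the policy interface (a function `policy(m)` returning a rescheduling interval in seconds), the toy cost model, the `arrival_rate` metric, and the names `compile_policy`, `estimated_cost`, `evolve_step`, and `stub_propose` are all assumptions made for illustration. In particular, `stub_propose` stands in for the LLM call and simply returns fixed candidate rewrites.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]            # hypothetical observation, e.g. {"arrival_rate": 4.0}
Policy = Callable[[Metrics], float]   # hypothetical: returns a rescheduling interval (s)


def compile_policy(source: str) -> Policy:
    """Turn policy source code into a callable named `policy`."""
    namespace: dict = {}
    exec(source, namespace)  # illustration only; real code would sandbox and validate
    return namespace["policy"]


def estimated_cost(policy: Policy, window: List[Metrics]) -> float:
    """Toy cost model (an assumption): reconfiguration overhead plus staleness."""
    total = 0.0
    for m in window:
        interval = max(policy(m), 1e-6)
        reconfig_overhead = 10.0 / interval               # frequent rescheduling costs more
        staleness = m["arrival_rate"] * interval / 10.0   # rare rescheduling hurts under load
        total += reconfig_overhead + staleness
    return total / len(window)


def evolve_step(current_src: str, window: List[Metrics],
                propose: Callable[[str, List[Metrics]], List[str]]) -> str:
    """Score proposed rewrites on recent observations; keep the cheapest policy."""
    best_src = current_src
    best_cost = estimated_cost(compile_policy(current_src), window)
    for candidate_src in propose(current_src, window):
        try:
            cost = estimated_cost(compile_policy(candidate_src), window)
        except Exception:
            continue  # discard candidates that fail to compile or run
        if cost < best_cost:
            best_src, best_cost = candidate_src, cost
    return best_src


def stub_propose(current_src: str, window: List[Metrics]) -> List[str]:
    """Stand-in for an LLM call: returns fixed candidate rewrites."""
    return [
        "def policy(m):\n    return 5.0\n",
        "def policy(m):\n    return 10.0 / max(m['arrival_rate'], 1.0) ** 0.5\n",
        "def policy(m):\n    return undefined_name\n",  # broken candidate, gets skipped
    ]


INITIAL_POLICY = "def policy(m):\n    return 30.0\n"
WINDOW = [{"arrival_rate": 4.0}, {"arrival_rate": 25.0}]

if __name__ == "__main__":
    evolved = evolve_step(INITIAL_POLICY, WINDOW, stub_propose)
    print(evolved)
```

In a serving deployment, a step like `evolve_step` would be invoked repeatedly on fresh observation windows, so the active policy source is replaced whenever a candidate scores better under current conditions. The sketch keeps the current policy when no candidate improves on it, which is one conservative design choice among several possible ones.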
