Cascade-Aware Multi-Agent Routing: Spatio-Temporal Sidecars and Geometry-Switching

Advanced AI reasoning systems route tasks through dynamic execution graphs of specialized agents. We identify a structural blind spot in this architecture: schedulers optimize load and fitness but lack a model of how failure propagates differently in tree-like versus cyclic graphs. In tree-like regimes, a single failure cascades exponentially; in dense cyclic regimes, it self-limits. A geometry-blind scheduler cannot distinguish these cases. We formalize this observability gap as an online geometry-control problem. We prove a cascade-sensitivity condition: failure spread is supercritical when the per-edge propagation probability p exceeds the inverse of the graph's branching factor e^{γ}, that is, when p > e^{-γ}, where γ is the BFS shell-growth exponent. We close this gap with a spatio-temporal sidecar that predicts which routing geometry fits the current topology. The sidecar comprises (i) a Euclidean propagation scorer for dense, cyclic subgraphs, (ii) a hyperbolic scorer capturing exponential risk in tree-like subgraphs, and (iii) a compact learned gate (133 parameters) that blends the two scores using topology and geometry-aware features. On 250 benchmark scenarios spanning five topology regimes, the sidecar lifts the native scheduler's win rate from 50.4% to 87.2% (+36.8 pp). In tree-like regimes, gains reach +48 to +68 pp. The learned gate achieves a held-out AUC of 0.9247, confirming that geometry preference is recoverable from live signals. Cross-architecture validation on Barabasi-Albert, Watts-Strogatz, and Erdos-Renyi graphs confirms that propagation modeling generalizes across graph families.
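As an illustration of the cascade-sensitivity condition p > e^{-γ}, the following minimal sketch estimates γ from BFS shell sizes and tests the threshold. It rests on assumptions not stated in the abstract: γ is fitted as the least-squares slope of log shell size against BFS depth, the graph is given as an adjacency dictionary, and the function names (bfs_shell_sizes, shell_growth_exponent, is_supercritical) are hypothetical. The estimator actually used in the paper may differ.

```python
import math
from collections import deque


def bfs_shell_sizes(adj, source):
    """Return [n_0, n_1, ...]: the number of nodes at each BFS distance from source."""
    dist = {source: 0}
    sizes = [1]
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                if dist[v] == len(sizes):
                    sizes.append(0)  # first node seen at a new depth
                sizes[dist[v]] += 1
                queue.append(v)
    return sizes


def shell_growth_exponent(sizes):
    """Assumed estimator: least-squares slope of log(n_d) versus d, i.e. n_d ~ exp(gamma * d)."""
    if len(sizes) < 2:
        return 0.0
    xs = list(range(len(sizes)))
    ys = [math.log(n) for n in sizes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


def is_supercritical(p, gamma):
    """Cascade-sensitivity condition from the abstract: p > e^{-gamma}."""
    return p > math.exp(-gamma)


# Tree-like example: a complete binary tree with 5 levels (shells 1, 2, 4, 8, 16).
tree = {i: [2 * i + 1, 2 * i + 2] for i in range(15)}
gamma_tree = shell_growth_exponent(bfs_shell_sizes(tree, 0))

# Cyclic example: an undirected ring of 16 nodes (shells stay at size 2 or less).
ring = {i: [(i - 1) % 16, (i + 1) % 16] for i in range(16)}
gamma_ring = shell_growth_exponent(bfs_shell_sizes(ring, 0))

print(is_supercritical(0.6, gamma_tree), is_supercritical(0.6, gamma_ring))
```

In this sketch the binary tree has a branching factor of 2, so the threshold sits at p = 0.5, while the ring shows no net shell growth and its threshold sits at p = 1. The same per-edge probability can therefore be supercritical in one topology and subcritical in the other, which is the distinction a geometry-blind scheduler cannot observe.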
