As Embodied AI systems move from research prototypes to real-world deployments, they need to evolve rapidly while remaining reliable under workload changes and partial failures. In practice, many deployments are only partially decoupled: middleware moves messages, but shared context and feedback semantics remain implicit, causing interface drift, cross-module interference, and brittle recovery at scale. We present ANCHOR, a modular framework that makes decoupling and robustness explicit system-level primitives. ANCHOR separates (i) Canonical Records, an evolvable contract for standardized shared state, from (ii) a communication bus for many-to-many dissemination and feedback-oriented coordination, forming an inspectable end-to-end loop. We validate closed-loop feasibility on a de-identified workflow instantiation, characterize latency distributions under varying payload sizes and publish rates, and demonstrate automatic stream resumption after hard crashes and restarts, even when shared memory is lost. Overall, ANCHOR turns ad-hoc integration glue into explicit contracts, enabling controlled degradation under load and self-healing recovery for scalable deployment of closed-loop AI systems.