Autonomous agents operating in dynamic, safety-critical environments require decision-making frameworks that are both computationally efficient and physically grounded. However, many existing approaches rely on end-to-end learning, which often lacks interpretability and offers no explicit mechanism for ensuring consistency with physical constraints. In this work, we propose an event-centric world modeling framework with memory-augmented retrieval for embodied decision-making. The framework represents the environment as a structured set of semantic events, which are encoded into a permutation-invariant latent representation. Decision-making is performed via retrieval over a knowledge bank of prior experiences, in which each entry associates an event representation with a corresponding maneuver. The final action is computed as a weighted combination of the retrieved maneuvers, providing a transparent link between each decision and the stored experiences behind it. This design enables structured abstraction of dynamic environments and supports interpretable decision-making through case-based reasoning. In addition, incorporating physics-informed knowledge into the retrieval process encourages the selection of maneuvers that are consistent with observed system dynamics. Experimental evaluation in UAV flight scenarios demonstrates that the framework operates within real-time control constraints while maintaining interpretable and consistent behavior.
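The retrieval pipeline described above (a set of events, a permutation-invariant encoding, a lookup in a knowledge bank, and a weighted combination of maneuvers) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the method of the paper: the function names `encode_events` and `retrieve_action` are hypothetical, mean pooling stands in for whatever encoder is actually used, and a softmax over negative squared distances stands in for the actual weighting rule. The physics-informed component of the retrieval is not modeled.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def encode_events(events: Sequence[Sequence[float]]) -> Vector:
    """Encode a set of event feature vectors by mean pooling.

    Mean pooling is an assumed stand-in for the encoder; it is
    permutation-invariant because the average ignores event order.
    """
    if not events:
        raise ValueError("at least one event is required")
    dim = len(events[0])
    return [sum(e[i] for e in events) / len(events) for i in range(dim)]


def retrieve_action(
    query: Vector,
    bank: Sequence[Tuple[Vector, Vector]],
    k: int = 2,
    temperature: float = 1.0,
) -> Tuple[Vector, List[Tuple[int, float]]]:
    """Blend the maneuvers of the k nearest knowledge-bank entries.

    Each bank entry is (event_representation, maneuver). The returned
    (index, weight) pairs show which stored cases shaped the action.
    """
    if not bank:
        raise ValueError("knowledge bank is empty")
    # Squared Euclidean distance from the query to every stored key.
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, key)), idx)
        for idx, (key, _) in enumerate(bank)
    )[:k]
    # Softmax over negative distances (assumed weighting scheme);
    # subtracting the smallest distance keeps the exponentials stable.
    d_min = scored[0][0]
    exps = [math.exp(-(d - d_min) / temperature) for d, _ in scored]
    total = sum(exps)
    weights = [(idx, e / total) for (_, idx), e in zip(scored, exps)]
    # Final action: weighted combination of the retrieved maneuvers.
    dim = len(bank[0][1])
    action = [sum(w * bank[idx][1][i] for idx, w in weights) for i in range(dim)]
    return action, weights


# Toy knowledge bank: (event representation, maneuver) pairs.
bank = [
    ([0.0, 0.0], [1.0, 0.0]),
    ([1.0, 1.0], [0.0, 1.0]),
    ([10.0, 10.0], [5.0, 5.0]),
]
query = encode_events([[1.0, 0.0], [0.0, 1.0]])
action, weights = retrieve_action(query, bank, k=2)
print(action, weights)
```

Returning the `(index, weight)` pairs alongside the action is what makes the decision traceable: each output can be attributed to specific stored cases, which is the case-based reasoning property the abstract emphasizes.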