Agentic AI systems act through tools and evolve their behavior over long, stochastic interaction traces. This setting complicates assurance because behavior depends on nondeterministic environments and probabilistic model outputs. Prior work introduced runtime verification for agentic AI via Dynamic Probabilistic Assurance (DPA), which learns a Markov decision process (MDP) online and model-checks quantitative properties against it. A key limitation is that developers must manually define the state abstraction, which couples verification to application-specific heuristics and increases adoption friction. This paper proposes TriCEGAR, a trace-driven abstraction mechanism that automates state construction from execution logs and supports online construction of an agent behavioral MDP. TriCEGAR represents abstractions as predicate trees that are learned from traces and refined using counterexamples. We describe a framework-native implementation that (i) captures typed agent lifecycle events, (ii) builds abstractions from traces, (iii) constructs an MDP, and (iv) performs probabilistic model checking to compute bounds such as Pmax(success) and Pmin(failure). We also show how run likelihoods enable anomaly detection as a guardrailing signal.
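To make the pipeline concrete, the following is a minimal sketch of the core ideas: a predicate tree maps concrete lifecycle events to abstract states, transition probabilities over those states are estimated from traces, and a run's log-likelihood under the learned model serves as the anomaly/guardrail signal. All names (`PredNode`, the event schema, the toy predicates) are illustrative assumptions, not TriCEGAR's actual API; the full system additionally handles nondeterministic actions (a true MDP) and counterexample-guided refinement, which are omitted here.

```python
import math
from collections import defaultdict

class PredNode:
    """Hypothetical predicate-tree node: internal nodes test a predicate
    on an event; leaves carry an abstract state label."""
    def __init__(self, pred=None, yes=None, no=None, label=None):
        self.pred, self.yes, self.no, self.label = pred, yes, no, label

    def abstract(self, event):
        """Map a concrete event dict to an abstract state label."""
        if self.label is not None:
            return self.label
        branch = self.yes if self.pred(event) else self.no
        return branch.abstract(event)

# Toy tree (assumed event schema): separate tool calls that errored
# from those that succeeded, and group everything else as LLM steps.
tree = PredNode(
    pred=lambda e: e["type"] == "tool_call",
    yes=PredNode(
        pred=lambda e: e.get("error", False),
        yes=PredNode(label="tool_error"),
        no=PredNode(label="tool_ok"),
    ),
    no=PredNode(label="llm_step"),
)

def learn_chain(traces, tree):
    """Estimate transition probabilities over abstract states by
    counting consecutive state pairs across the given traces."""
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        states = [tree.abstract(e) for e in trace]
        for s, t in zip(states, states[1:]):
            counts[s][t] += 1
    return {
        s: {t: c / sum(nxt.values()) for t, c in nxt.items()}
        for s, nxt in counts.items()
    }

def log_likelihood(trace, tree, probs, floor=1e-6):
    """Score a run under the learned model; unseen transitions get a
    small floor probability. Unusually low scores flag anomalous runs."""
    states = [tree.abstract(e) for e in trace]
    ll = 0.0
    for s, t in zip(states, states[1:]):
        ll += math.log(probs.get(s, {}).get(t, floor))
    return ll
```

In this sketch the learned object is a Markov chain rather than an MDP (no action nondeterminism), which is enough to illustrate how run likelihoods become a guardrailing signal: runs whose transition sequence is improbable under the model score far below typical runs and can be escalated or blocked.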