In this paper, we introduce a set of \textit{Linear Temporal Logic} (LTL) formulae designed to explain policies. We focus on explanations that capture both the eventual objectives a policy achieves and the prerequisites it maintains throughout its execution. These LTL-based explanations have a structured representation that is particularly well suited to local-search techniques. We demonstrate the effectiveness of the proposed approach in a simulated capture-the-flag environment and conclude with directions for future research.