Humans' expectations stem from their knowledge of others and the world. In human-AI interaction, this knowledge may be inconsistent with the ground truth, so the AI agent may fail to meet human expectations and team performance may degrade. Explicable planning was previously introduced as a planning approach that reconciles human expectations with the agent's optimal behavior for more interpretable decision-making. One critical issue that remains unaddressed is safety: explicable planning can lead to explicable behaviors that are unsafe. We propose Safe Explicable Planning (SEP), which extends the prior work to support the specification of a safety bound. The objective of SEP is to search for behaviors that are close to the human's expectations while satisfying a bound on the agent's return, the safety criterion chosen in this work. We show that the problem generalizes multi-objective optimization and that our formulation introduces a Pareto set. Under this formulation, we propose a novel exact method that returns the Pareto set of safe explicable policies, a more efficient greedy method that returns one of the Pareto-optimal policies, and approximate solutions for both, based on state aggregation, to further improve scalability. Formal proofs are provided to validate the desired theoretical properties of the exact and greedy methods. We evaluate our methods both in simulation and with physical robot experiments. The results confirm the validity and efficacy of our methods for safe explicable planning.
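To make the idea of a Pareto set under a safety bound concrete, the following is a minimal illustrative sketch, not the exact or greedy method of this work. It assumes a small, already enumerated set of candidate policies, each with a return under the agent's model and a tuple of hypothetical explicability scores (higher means closer to the human's expectations). The names `Candidate`, `min_return`, and `safe_pareto_set`, and all numbers, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate policy summarized by its scores."""
    name: str
    agent_return: float        # return under the agent's own model
    scores: Tuple[float, ...]  # assumed explicability scores, higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on every score and strictly better on one."""
    at_least = all(x >= y for x, y in zip(a.scores, b.scores))
    strictly = any(x > y for x, y in zip(a.scores, b.scores))
    return at_least and strictly


def safe_pareto_set(candidates: List[Candidate], min_return: float) -> List[Candidate]:
    """Keep candidates that meet the return bound, then drop dominated ones."""
    # Safety filter: the bound on the agent's return is the safety criterion.
    safe = [c for c in candidates if c.agent_return >= min_return]
    # Pareto filter: keep only candidates not dominated by another safe candidate.
    return [c for c in safe if not any(dominates(o, c) for o in safe)]


# Toy data (made up): two explicability scores per candidate.
candidates = [
    Candidate("optimal", 10.0, (0.2, 0.3)),
    Candidate("expected", 4.0, (0.9, 0.9)),      # most explicable, but violates the bound
    Candidate("compromise_a", 8.0, (0.7, 0.5)),
    Candidate("compromise_b", 8.5, (0.5, 0.7)),
    Candidate("dominated", 9.0, (0.4, 0.4)),
]

result = safe_pareto_set(candidates, min_return=8.0)
print([c.name for c in result])
```

In this toy setting, the candidate that best matches the human's expectations is excluded because it violates the return bound, and the remaining safe candidates are compared only on the assumed explicability scores. Enumerating policies this way does not scale, which is where a greedy search or state aggregation, as mentioned above, would come in.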