Human expectations arise from people's understanding of others and the world. In human-AI interaction, this understanding may not align with reality, so the AI agent can fail to meet expectations, compromising team performance. Explicable planning was introduced to bridge this gap: it reconciles human expectations with the agent's optimal behavior to facilitate interpretable decision-making. However, ensuring safety in explicable planning remains a critical unresolved issue, since it can produce explicable behaviors that are unsafe. To address this, we propose Safe Explicable Planning (SEP), which extends prior work to support the specification of a safety bound. The goal of SEP is to find behaviors that align with human expectations while adhering to the specified safety criterion. Our approach generalizes to multiple objectives stemming from multiple models rather than a single model, yielding a Pareto set of safe explicable policies. We present both an exact method that is guaranteed to find the Pareto set and a more efficient greedy method that finds one policy in the Pareto set. Additionally, we offer approximate solutions based on state aggregation to improve scalability. We provide formal proofs of the desired theoretical properties of these methods. Evaluations in simulation and in physical robot experiments confirm the effectiveness of our approach for safe explicable planning.
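To make the idea of a "Pareto set of safe explicable policies" concrete, the following is a minimal sketch over a small, hand-written list of candidate policies. It is not the exact or greedy method of the paper. It assumes, purely for illustration, that safety is checked as a lower bound on the return under the agent's own model and that each human model contributes one score to be maximized. All names (`Candidate`, `agent_return`, `human_scores`, `safe_pareto_set`, `lexicographic_pick`) and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """A candidate policy summarized by illustrative (hypothetical) numbers."""
    name: str
    agent_return: float              # assumed: return under the agent's model
    human_scores: Tuple[float, ...]  # assumed: one score per human model


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def safe_pareto_set(candidates: List[Candidate], safety_bound: float) -> List[Candidate]:
    """Drop candidates below the safety bound, then keep the non-dominated ones."""
    safe = [c for c in candidates if c.agent_return >= safety_bound]
    return [
        c for c in safe
        if not any(dominates(o.human_scores, c.human_scores) for o in safe)
    ]


def lexicographic_pick(candidates: List[Candidate], safety_bound: float) -> Optional[Candidate]:
    """Return one safe candidate by lexicographic order on the scores.

    A lexicographic maximum cannot be dominated, so it lies in the Pareto set.
    This is a stand-in for "find one Pareto policy", not the paper's greedy method.
    """
    safe = [c for c in candidates if c.agent_return >= safety_bound]
    return max(safe, key=lambda c: c.human_scores) if safe else None


# Hypothetical data: D matches both human models best but violates the bound.
candidates = [
    Candidate("A", 10.0, (0.9, 0.2)),
    Candidate("B", 8.0, (0.5, 0.8)),
    Candidate("C", 9.0, (0.4, 0.7)),
    Candidate("D", 3.0, (1.0, 1.0)),
]

if __name__ == "__main__":
    print([c.name for c in safe_pareto_set(candidates, safety_bound=7.0)])
    print(lexicographic_pick(candidates, safety_bound=7.0))
```

In this toy data, the candidate that best matches both human models is excluded by the safety bound, and the remaining trade-off between the two human models is what produces a set of policies instead of a single answer.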