An Epistemic Human-Aware Task Planner which Anticipates Human Beliefs and Decisions

We present a substantial extension of our Human-Aware Task Planning framework, tailored to scenarios in which execution experience is only intermittently shared and the beliefs of humans and robots diverge significantly, particularly because humans are uncontrollable. Our objective is to build a robot policy that accounts for uncontrollable human behaviors, thus enabling the anticipation of the progress the robot may make while execution is not shared, e.g., when humans briefly leave the shared environment to complete a subtask. However, this anticipation is considered from the perspective of humans, who have access to an estimated model of the robot. To this end, we propose a novel planning framework and build a solver based on AND-OR search that integrates knowledge reasoning, including situation assessment by perspective taking. Our approach dynamically models and manages the expansion and contraction of potential advances while precisely keeping track of when (and when not) agents share the task execution experience. The planner systematically assesses the situation and discards worlds that it has reason to believe are impossible for humans. Overall, our new solver can estimate the distinct beliefs of the human and the robot along potential courses of action, enabling the synthesis of plans in which the robot selects the right moment to communicate (by informing or by replying to an inquiry) or defers ontic actions until the execution experience can be shared. Preliminary experiments in two domains, one novel and one adapted, demonstrate the effectiveness of the framework.
