Autonomous agents require some form of goal and plan recognition to interact in multiagent settings. Unfortunately, all existing goal recognition datasets suffer from a systematic bias induced by the planning systems that generated them, namely heuristic-based forward search. As a result, existing datasets are not challenging enough for more realistic scenarios (e.g., agents using different planners), which limits the evaluation of goal recognisers when different planners are used to achieve the same goal. In this paper, we propose a new method that uses top-k planning to generate multiple, different plans for the same goal hypothesis, yielding benchmarks that mitigate the bias found in current datasets. This allows us to introduce a new metric, the Version Coverage Score (VCS), which measures the resilience of a goal recogniser when inferring a goal from different sets of plans. Our results show that the resilience of the current state-of-the-art goal recogniser degrades substantially in low-observability settings.