Learning to Ground Existentially Quantified Goals

From arXiv, 11 pages. Accepted at the 21st International Conference on Principles of Knowledge Representation and Reasoning (KR 2024), in the Reasoning, Learning, and Decision Making track.

Goal instructions for autonomous AI agents cannot assume that objects have unique names. Instead, objects in goals must be referred to by providing suitable descriptions. However, this raises problems in both classical planning and generalized planning. The standard approach to handling existentially quantified goals in classical planning involves compiling them into a DNF formula that encodes all possible variable bindings and adding dummy actions to map each DNF term into a new, dummy goal. This preprocessing is exponential in the number of variables. In generalized planning, the problem is different: even if general policies can deal with any initial situation and goal, executing a general policy requires the goal to be grounded in order to define a value for the policy features. The problem of grounding goals, namely finding the objects to bind the goal variables, is subtle: it generalizes both classical planning, which is the special case with no goal variables to bind, and constraint reasoning, which is the special case with no actions. In this work, we address the goal grounding problem with a novel supervised learning approach. A GNN architecture, trained to predict the cost of partially quantified goals over small domain instances, is tested on larger instances involving more objects and different quantified goals. The proposed architecture is evaluated experimentally over several planning domains, where generalization is tested along several dimensions, including the number of goal variables and the number of objects that can bind such variables. The scope of the approach is also discussed in light of the known relationship between GNNs and C2 logics.
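
To make the exponential blow-up of the DNF compilation concrete, the following is a minimal sketch, not the compilation used in the paper. It only enumerates the DNF terms (one per variable binding) of a conjunctive, existentially quantified goal and leaves out the dummy actions. The function name `ground_goal`, the Blocksworld-style predicates, and the tuple encoding of atoms are all assumptions made for illustration.

```python
from itertools import product


def ground_goal(variables, atoms, objects):
    """Enumerate one DNF term per binding of the goal variables.

    variables: goal variable names, e.g. ["?x", "?y"] (hypothetical encoding)
    atoms:     goal atoms as tuples (predicate, arg1, ...), where each
               argument is either a goal variable or a constant
    objects:   names of the objects that may bind the variables
    """
    terms = []
    # Every variable can be bound to every object: len(objects) ** len(variables) bindings.
    for binding in product(objects, repeat=len(variables)):
        subst = dict(zip(variables, binding))
        # Substitute variables; constants are left unchanged.
        term = [(atom[0],) + tuple(subst.get(arg, arg) for arg in atom[1:])
                for atom in atoms]
        terms.append(term)
    return terms


# Illustrative goal: "some red block is on some blue block".
goal_vars = ["?x", "?y"]
goal_atoms = [("on", "?x", "?y"), ("red", "?x"), ("blue", "?y")]
blocks = ["a", "b", "c"]

dnf = ground_goal(goal_vars, goal_atoms, blocks)
print(len(dnf))  # 3 objects, 2 variables: 3 ** 2 = 9 terms
```

With `n` objects and `k` goal variables the enumeration produces `n ** k` terms, which is the growth the abstract refers to. A learned cost predictor over partially quantified goals, as proposed in the paper, would instead bind variables one at a time rather than enumerate all bindings up front.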
