Open-vocabulary state tracking is a more practical variant of state tracking that aims to track the state changes of entities throughout a process without restricting the state space or the entity space. OpenPI is to date the only dataset annotated for open-vocabulary state tracking. However, we identify issues with both the dataset quality and the evaluation metric. For the dataset, we categorize three types of problems, at the procedure level, the step level, and the state-change level, and build a clean dataset, OpenPI-C, using multiple rounds of human judgment. For the evaluation metric, we propose a cluster-based metric to fix the original metric's preference for repetition. Model-wise, we enhance the seq2seq generation baseline by reinstating two key properties for state tracking: temporal dependency and entity awareness. First, the state of the world after an action is inherently dependent on the previous state. We model this dependency through a dynamic memory bank and allow the model to attend to the memory slots during decoding. Second, the state of the world is naturally a union of the states of the involved entities. Since the entities are unknown in the open-vocabulary setting, we propose a two-stage model that refines the state change prediction conditioned on the entities predicted in the first stage. Empirical results show the effectiveness of our proposed model, especially on the cluster-based metric. The code and data are released at https://github.com/shirley-wu/openpi-c
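To illustrate why a per-prediction matching metric can prefer repetition, and how grouping predictions into clusters can counter it, here is a minimal sketch. It is not the metric defined in the paper: the abstract does not say how clusters are formed or scored, so the token-level F1 similarity, the greedy clustering, the 0.8 threshold, and all function names below are assumptions made purely for illustration.

```python
from collections import Counter


def token_f1(a: str, b: str) -> float:
    """Bag-of-words F1 between two strings (assumed similarity measure)."""
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return 0.0
    overlap = sum((Counter(ta) & Counter(tb)).values())
    if overlap == 0:
        return 0.0
    p, r = overlap / len(ta), overlap / len(tb)
    return 2 * p * r / (p + r)


def naive_precision(preds, golds):
    """Average best-match score per prediction; every duplicate is counted."""
    if not preds or not golds:
        return 0.0
    return sum(max(token_f1(p, g) for g in golds) for p in preds) / len(preds)


def cluster(items, threshold=0.8):
    """Greedy one-pass clustering: an item joins the first cluster whose
    representative (its first member) is at least `threshold` similar."""
    clusters = []
    for item in items:
        for c in clusters:
            if token_f1(item, c[0]) >= threshold:
                c.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def clustered_precision(preds, golds, threshold=0.8):
    """Score one representative per cluster, so repeats earn no extra credit."""
    reps = [c[0] for c in cluster(preds, threshold)]
    return naive_precision(reps, golds)


golds = ["pan was cold before and hot afterwards"]
preds = [
    "pan was cold before and hot afterwards",
    "pan was cold before and hot afterwards",
    "pan was cold before and hot afterwards",
    "oil was solid before and liquid afterwards",
]
print(naive_precision(preds, golds))      # inflated by the repeated prediction
print(clustered_precision(preds, golds))  # repeats collapse into one cluster
```

In this toy case the three identical predictions fall into a single cluster, so the one incorrect prediction weighs as much as the correct one and the score drops relative to the naive average.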