Recent work on Rank-One Model Editing (ROME), a popular model editing method, has shown that the algorithm cannot edit certain facts without breaking the model. Such edits have previously been called disabling edits. Disabling edits cause immediate model collapse and limit the use of ROME for sequential editing. In this paper, we make two main contributions. First, we show that model collapse with ROME happens only when making edits using the CounterFact dataset and does not happen when using the zsRE dataset. Second, we find that disabling edits are an artifact of the original implementation of ROME. With this paper, we provide a more stable implementation of ROME, which we call r-ROME, and show that we no longer observe model collapse when making large-scale sequential edits with ROME.