Knowledge editing has emerged as a crucial technique for efficiently correcting incorrect or outdated knowledge in large language models (LLMs). Existing editing methods for unimodal LLMs rely on a rigid parameter-to-output mapping, which causes causal underfitting and causal overfitting in the cascaded reasoning of multimodal LLMs (MLLMs). In this paper, we reformulate MLLM editing as an out-of-distribution (OOD) generalization problem, whose goal is to distinguish semantic shift from factual shift and thereby achieve robust editing across diverse cross-modal prompts. The key challenge of this OOD problem lies in identifying invariant causal trajectories that generalize accurately while suppressing spurious correlations. To address it, we propose ODEdit, a plug-and-play framework based on invariant learning that optimizes a tripartite OOD risk objective to simultaneously enhance editing reliability, locality, and generality. We further introduce an edit-trajectory invariant learning method, which integrates a total variation penalty into the risk minimization objective to stabilize edit trajectories against environmental variation. Theoretical analysis and extensive experiments demonstrate the effectiveness of ODEdit.
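As a rough illustrative sketch (not the paper's exact formulation), a tripartite OOD risk of this kind, combined with a total variation penalty on the edit trajectory, might take the following form; the environment set $\mathcal{E}$, the per-objective losses $\mathcal{L}^{e}_{\text{rel}}, \mathcal{L}^{e}_{\text{loc}}, \mathcal{L}^{e}_{\text{gen}}$, the trajectory states $h_t(\theta)$, and the weights $\lambda_{\text{loc}}, \lambda_{\text{gen}}, \beta$ are all assumed placeholders:

$$
\min_{\theta}\;
\underbrace{\mathbb{E}_{e\in\mathcal{E}}\!\left[\mathcal{L}^{e}_{\text{rel}}(\theta)\right]}_{\text{reliability}}
+ \lambda_{\text{loc}}\,\underbrace{\mathbb{E}_{e\in\mathcal{E}}\!\left[\mathcal{L}^{e}_{\text{loc}}(\theta)\right]}_{\text{locality}}
+ \lambda_{\text{gen}}\,\underbrace{\mathbb{E}_{e\in\mathcal{E}}\!\left[\mathcal{L}^{e}_{\text{gen}}(\theta)\right]}_{\text{generality}}
+ \beta \sum_{t}\bigl\|h_{t+1}(\theta) - h_{t}(\theta)\bigr\|_{1}
$$

Here the final term is a discrete total-variation penalty: by penalizing the accumulated change between successive trajectory states $h_t(\theta)$ averaged over environments, it discourages edit trajectories from fluctuating with environmental variation, which is the stabilizing role the abstract attributes to the penalty.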