TimeROME-DLM: Temporal Causal Tracing and Low-Rank Inference-Time Knowledge Editing for Masked Diffusion Language Models

Masked diffusion language models (MDLMs) such as LLaDA now rival autoregressive (AR) LLMs, yet every existing knowledge-editing and unlearning method (ROME, MEMIT, etc.) targets AR transformers. These methods either rest on assumptions that fail under iterative denoising or require gradient updates, whose backward-pass activations cost tens of GB of extra VRAM and which collapse MDLMs at standard learning rates. We introduce TimeROME-DLM, the first training-free, gradient-free, inference-time knowledge-editing framework for MDLMs. It couples two components: (i) a Temporal Indirect Effect (TIE) causal-tracing protocol that identifies, for each fact, the coordinate whose intervention most strongly drives the object prediction at later denoising steps; and (ii) a closed-form, low-rank residual edit memory that aggregates subject keys and target deltas across all forget facts and applies a single ridge-regularised update at that coordinate on every diffusion forward pass, with sparsification to limit utility spillover. Backbone weights stay frozen; only three hyperparameters (alpha, lambda, q) are tuned on a small validation split. On TOFU forget01 with TOFU-finetuned LLaDA-8B-Base, TimeROME-DLM cuts forget-set log-probability by roughly 83 nats, and the same configuration transfers to LLaDA-8B-Instruct, Dream-7B, MMaDA-8B, DiffuLLaMA-7B, and LLaDA-MoE-1.4B. The method keeps retain-set log-probability nearly flat (within ~1 nat at the utility-safe operating point) across 50 sequentially inserted facts, delivers a four- to fourteen-fold wall-clock speedup over the strongest converged training-time baseline with zero additional VRAM, and scales sub-linearly to 400 facts. TimeROME-DLM thus closes the locate-then-edit gap between AR LLMs and MDLMs at a fraction of the computational cost.
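To make the second component concrete, the following is a minimal NumPy sketch of a closed-form, ridge-regularised edit memory of the kind the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the function names (`build_edit_memory`, `sparsify`, `apply_edit`) are hypothetical, and we assume that lambda is the ridge penalty, alpha scales the residual update, and q is a quantile threshold for magnitude-based sparsification. The abstract names these three hyperparameters but does not define them.

```python
import numpy as np


def build_edit_memory(keys, deltas, lam):
    """Closed-form ridge solution mapping subject keys to target deltas.

    keys:   (n_facts, d) array, one subject key per forget fact (assumed layout)
    deltas: (n_facts, d) array, one target delta per forget fact (assumed layout)
    lam:    ridge penalty (assumed meaning of "lambda")

    Returns W of shape (d, d) minimising ||K W^T - D||^2 + lam * ||W||^2.
    The dual form below only inverts an n_facts x n_facts matrix, and the
    resulting W has rank at most n_facts, i.e. it is low-rank when n_facts << d.
    """
    n_facts = keys.shape[0]
    gram = keys @ keys.T + lam * np.eye(n_facts)
    coeffs = np.linalg.solve(gram, keys)   # (n_facts, d)
    return deltas.T @ coeffs               # (d, d)


def sparsify(update, q):
    """Keep entries whose magnitude reaches the q-quantile; zero the rest.

    Assumption: "q" is a quantile in [0, 1] over absolute values.
    """
    magnitude = np.abs(update)
    threshold = np.quantile(magnitude, q)
    return np.where(magnitude >= threshold, update, 0.0)


def apply_edit(hidden, W, alpha, q):
    """Add the scaled, sparsified residual update to one hidden-state vector.

    In this sketch the function would be called at the traced coordinate on
    every denoising forward pass; the backbone weights are never modified.
    """
    update = W @ hidden
    return hidden + alpha * sparsify(update, q)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keys = rng.normal(size=(3, 8))     # 3 toy facts, hidden size 8
    deltas = rng.normal(size=(3, 8))
    W = build_edit_memory(keys, deltas, lam=1e-2)
    edited = apply_edit(keys[0], W, alpha=1.0, q=0.5)
    print(edited.shape)
```

The dual (Gram-matrix) form is used so that the cost of building the memory depends on the number of facts rather than on the hidden dimension; whether the original method uses this form is not stated in the abstract.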
