Task Arithmetic yields a modular, scalable way to adapt foundation models. Combining multiple task vectors, however, can lead to cross-task interference, causing representation drift and degraded performance. Representation drift regularization provides a natural remedy to disentangle task vectors; however, existing approaches typically require external task data, conflicting with modularity and data availability constraints (e.g., privacy requirements). We propose a dataless approach by framing regularization against representation drift as a curvature matrix approximation problem. This allows us to leverage well-established techniques; in particular, we adopt Kronecker-Factored Approximate Curvature and obtain a practical regularizer that achieves state-of-the-art results in task addition and negation. Our method has constant complexity in the number of tasks and promotes robustness to task vector rescaling, eliminating the need for held-out tuning.
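As context for the abstract, task arithmetic edits a model by adding or subtracting task vectors, i.e. parameter differences between a fine-tuned and a pretrained model, scaled by a coefficient. The following is a minimal toy sketch of task addition and negation on flat parameter arrays; all names and the toy dimensions are illustrative, not from the paper.

```python
import numpy as np

# Toy setup (illustrative only): pretrained parameters and two task vectors,
# where each task vector is tau_i = theta_finetuned_i - theta_pretrained.
rng = np.random.default_rng(0)
theta_pre = rng.normal(size=8)          # pretrained parameters
tau_a = rng.normal(scale=0.1, size=8)   # task vector for task A
tau_b = rng.normal(scale=0.1, size=8)   # task vector for task B

alpha = 0.5                             # rescaling coefficient

# Task addition: merge both skills into a single model.
theta_add = theta_pre + alpha * (tau_a + tau_b)

# Task negation: subtract a task vector to suppress task A's behavior.
theta_neg = theta_pre - alpha * tau_a

print(theta_add.shape, theta_neg.shape)
```

In practice the coefficient `alpha` is usually tuned on held-out data; the abstract's claim of robustness to task vector rescaling is precisely about removing that tuning step.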