Changing how pre-trained models behave, for example by improving their performance on a downstream task or mitigating biases learned during pre-training, is a common practice when developing machine learning systems. In this work, we propose a new paradigm for steering the behavior of neural networks, centered around \textit{task vectors}. A task vector specifies a direction in the weight space of a pre-trained model, such that movement in that direction improves performance on the task. We build task vectors by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a task. We show that these task vectors can be modified and combined through arithmetic operations such as negation and addition, and that the behavior of the resulting model is steered accordingly. Negating a task vector decreases performance on the target task, with little change in model behavior on control tasks. Moreover, adding task vectors together can improve performance on multiple tasks at once. Finally, when tasks are linked by an analogy relationship of the form ``A is to B as C is to D'', combining task vectors from three of the tasks can improve performance on the fourth, even when no data from the fourth task is used for training. Overall, our experiments with several models, modalities and tasks show that task arithmetic is a simple, efficient and effective way of editing models.
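The operations described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, weights are modeled as plain dictionaries of float lists rather than real model checkpoints, the `scale` argument is an added assumption, and the analogy is assumed to combine the three vectors as C + (B - A), a form the text above does not spell out.

```python
# Minimal sketch of task arithmetic on toy "weights".
# Assumptions (not from the source): weights are dicts mapping parameter
# names to flat lists of floats; all function names are hypothetical.

def task_vector(pretrained, finetuned):
    """Fine-tuned weights minus pre-trained weights, element-wise."""
    return {
        name: [f - p for p, f in zip(pretrained[name], finetuned[name])]
        for name in pretrained
    }

def negate(vector):
    """Flip the sign of every entry in a task vector."""
    return {name: [-v for v in values] for name, values in vector.items()}

def add(*vectors):
    """Element-wise sum of one or more task vectors with matching keys."""
    return {
        name: [sum(entries) for entries in zip(*(v[name] for v in vectors))]
        for name in vectors[0]
    }

def apply_vector(pretrained, vector, scale=1.0):
    """Move the pre-trained weights along a task vector.

    `scale` is an illustrative assumption; the abstract does not mention it.
    """
    return {
        name: [p + scale * t for p, t in zip(pretrained[name], vector[name])]
        for name in pretrained
    }

# Toy example: one parameter tensor with two entries.
pretrained = {"w": [1.0, 2.0]}
finetuned_a = {"w": [1.5, 2.0]}   # hypothetical model fine-tuned on task A
finetuned_b = {"w": [1.0, 3.0]}   # hypothetical model fine-tuned on task B

tv_a = task_vector(pretrained, finetuned_a)
tv_b = task_vector(pretrained, finetuned_b)

# Negation: step away from task A.
forget_a = apply_vector(pretrained, negate(tv_a))

# Addition: step toward tasks A and B at once.
multi_task = apply_vector(pretrained, add(tv_a, tv_b))

# Analogy (assumed form): estimate a vector for D from A, B, and C.
tv_c = {"w": [0.25, 0.0]}         # hypothetical task vector for task C
tv_d_estimate = add(tv_c, tv_b, negate(tv_a))
```

In practice the same arithmetic would be applied to every tensor in a model checkpoint; whether the edited weights actually behave as intended has to be checked empirically on the target and control tasks.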