While large language models (LLMs) have demonstrated strong multi-task capabilities, understanding the learning mechanisms behind these capabilities remains a challenging problem. In this paper, we attempt to understand such mechanisms from the perspective of neurons. Specifically, we detect task-sensitive neurons in LLMs via gradient attribution on task-specific data. Through extensive deactivation and fine-tuning experiments, we demonstrate that the detected neurons are highly correlated with the given task, and we therefore term them task-specific neurons. With these identified task-specific neurons, we examine two common problems in multi-task learning and continuous learning: generalization and catastrophic forgetting. We find that the overlap of task-specific neurons across tasks is strongly associated with generalization and specialization across those tasks. Interestingly, at certain layers of LLMs, the parameters of neurons specific to different tasks are highly similar, and this similarity is highly correlated with generalization performance. Inspired by these findings, we propose a neuron-level continuous fine-tuning method that fine-tunes only the neurons specific to the current task during continuous learning, and extensive experiments demonstrate the effectiveness of the proposed method. Our study provides insights into the interpretability of LLMs in multi-task learning.
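The three ideas summarized above (scoring neurons by gradient attribution, measuring the overlap of task-specific neuron sets, and updating only the neurons of the current task) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the attribution formula, so the score `|activation * gradient|`, the top-k selection rule, the Jaccard overlap measure, the function names, and all toy numbers below are assumptions chosen for illustration.

```python
from typing import List, Set


def attribution_scores(activations: List[float], gradients: List[float]) -> List[float]:
    """Score each neuron by |activation * gradient| (an assumed attribution rule)."""
    return [abs(a * g) for a, g in zip(activations, gradients)]


def top_k_neurons(scores: List[float], k: int) -> Set[int]:
    """Return the indices of the k highest-scoring neurons (assumed selection rule)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def jaccard_overlap(a: Set[int], b: Set[int]) -> float:
    """Overlap between two neuron sets: |a & b| / |a | b| (assumed overlap measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def masked_sgd_step(
    weights: List[List[float]],
    grads: List[List[float]],
    neurons: Set[int],
    lr: float,
) -> List[List[float]]:
    """Apply one SGD step to the rows (neurons) in `neurons`; keep other rows frozen."""
    return [
        [w - lr * g for w, g in zip(w_row, g_row)] if i in neurons else list(w_row)
        for i, (w_row, g_row) in enumerate(zip(weights, grads))
    ]


# Toy per-neuron activations and gradients for two hypothetical tasks (made-up values).
task_a = top_k_neurons(attribution_scores([0.9, 0.1, 0.5, 0.0], [0.8, 0.2, -0.6, 1.0]), k=2)
task_b = top_k_neurons(attribution_scores([0.1, 0.7, 0.6, 0.2], [0.1, -0.9, 0.5, 0.1]), k=2)
overlap = jaccard_overlap(task_a, task_b)

# Toy layer with 4 neurons (rows) and 2 input weights each; only task A's neurons move.
weights = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
grads = [[1.0, 1.0] for _ in weights]
updated = masked_sgd_step(weights, grads, task_a, lr=0.1)

print(sorted(task_a), sorted(task_b), overlap)
print(updated)
```

In this sketch each row of `weights` stands for one neuron, so freezing a row corresponds to leaving that neuron untouched while the current task's neurons are fine-tuned. In a real LLM the activations and gradients would come from forward and backward passes over task-specific data rather than from hand-written lists.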