Influence functions are crucial tools for assessing sample influence in model interpretation, training subset selection, noisy label detection, and more. By employing a first-order Taylor expansion, influence functions can estimate sample influence without expensive model retraining. However, applying influence functions directly to deep models is challenging, primarily due to the non-convex nature of the loss function and the large number of model parameters. This not only makes computing the inverse of the Hessian matrix costly, but in some cases the inverse does not exist at all. Various approaches, including matrix decomposition, have been explored to accelerate and approximate the inversion of the Hessian matrix, with the aim of making influence functions applicable to deep models. In this paper, we revisit a simple, albeit naive, yet effective approximation method known as TracIn, which substitutes the inverse of the Hessian matrix with an identity matrix. We provide deeper insights into why this simple approximation performs well. Furthermore, we extend its applications beyond measuring model utility to include considerations of fairness and robustness. Finally, we enhance TracIn through an ensemble strategy. To validate its effectiveness, we conduct experiments on synthetic data and extensive evaluations on noisy label detection, sample selection for large language model fine-tuning, and defense against adversarial attacks.
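To make the identity-matrix substitution concrete, the sketch below shows the TracIn-style influence score on a toy logistic-regression model: instead of computing grad(train) · H⁻¹ · grad(test) as in classical influence functions, the inverse Hessian is replaced by the identity, so the score reduces to a dot product of loss gradients. The logistic-regression setup and all names here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def loss_grad(w, x, y):
    """Gradient of the logistic loss w.r.t. parameters w for one sample (x, y)."""
    p = sigmoid(w @ x)
    return (p - y) * x

def tracin_influence(w, x_train, y_train, x_test, y_test):
    """TracIn-style score: inverse Hessian replaced by the identity,
    leaving a plain dot product of the two loss gradients."""
    return loss_grad(w, x_train, y_train) @ loss_grad(w, x_test, y_test)

# Toy example: two 2-D samples with the same label under a fixed parameter vector.
w = np.array([0.5, -0.3])
x_tr, y_tr = np.array([1.0, 2.0]), 1
x_te, y_te = np.array([0.8, 1.5]), 1
score = tracin_influence(w, x_tr, y_tr, x_te, y_te)
```

A positive score indicates the training sample's gradient step would also reduce the test sample's loss, which is the intuition behind using these scores for sample selection and noisy label detection.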