Influence functions are widely used to assess the influence of individual training samples, with applications in model interpretation, training-subset selection, noisy label detection, and more. Using a first-order Taylor expansion, they estimate sample influence without expensive model retraining. Applying influence functions directly to deep models is challenging, however, primarily because the loss function is non-convex and the number of model parameters is large. As a result, the inverse of the Hessian matrix is costly to compute and, in some cases, does not exist. Various approaches, including matrix decomposition, have been explored to accelerate and approximate the Hessian inversion so that influence functions become applicable to deep models. In this paper, we revisit a naive yet effective approximation known as TracIn, which replaces the inverse of the Hessian matrix with the identity matrix. We provide deeper insight into why this simple approximation performs well. Furthermore, we extend its applications beyond measuring model utility to fairness and robustness. Finally, we enhance TracIn through an ensemble strategy. To validate its effectiveness, we conduct experiments on synthetic data and extensive evaluations on noisy label detection, sample selection for large language model fine-tuning, and defense against adversarial attacks.
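To make the identity-matrix substitution concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain logistic-regression model trained with NumPy, and the names `per_sample_grads` and `identity_hessian_influence` are hypothetical helpers introduced only for illustration. With the inverse Hessian replaced by the identity, the influence of a training sample on the validation loss reduces to the dot product between that sample's loss gradient and the mean validation-loss gradient. Under the sign convention used here, a positive score suggests that a gradient step on the sample would lower the validation loss, and a negative score suggests it would raise it.

```python
import numpy as np

def per_sample_grads(w, X, y):
    """Gradient of the logistic loss w.r.t. w, one row per sample."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))  # predicted P(y = 1)
    return (p - y)[:, None] * X         # shape (n_samples, n_features)

def identity_hessian_influence(w, X_train, y_train, X_val, y_val):
    """Score each training sample with the inverse Hessian set to the identity.

    score_i = <grad of sample i's training loss, grad of mean validation loss>
    """
    g_val = per_sample_grads(w, X_val, y_val).mean(axis=0)
    g_train = per_sample_grads(w, X_train, y_train)
    return g_train @ g_val

# Synthetic data (assumed setup): the label depends only on the first feature.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
y_clean = (X_train[:, 0] > 0).astype(float)
y_train = y_clean.copy()
y_train[:20] = 1.0 - y_train[:20]       # flip the first 20 labels as noise

X_val = rng.normal(size=(500, 2))
y_val = (X_val[:, 0] > 0).astype(float)

# Fit the model on the noisy training set with plain gradient descent.
w = np.zeros(2)
for _ in range(200):
    w -= 0.5 * per_sample_grads(w, X_train, y_train).mean(axis=0)

scores = identity_hessian_influence(w, X_train, y_train, X_val, y_val)
ranking = np.argsort(scores)            # most harmful samples first
```

In this toy setting the flipped-label samples tend to receive the lowest scores, which is the intuition behind using such scores for noisy label detection. The sketch evaluates gradients at a single set of trained weights; how scores are aggregated over training or across an ensemble is left out here.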