Influence functions (IFs) elucidate how training data changes model behavior. However, the growing size and non-convexity of large-scale models make IFs inaccurate. We suspect that this fragility comes from the first-order approximation, which may cause nuisance changes in parameters irrelevant to the examined data. Simply restricting the influence computation to a chosen subset of parameters can also be misleading, because it fails to nullify the hidden effects of the unselected parameters on the analyzed data. We therefore introduce generalized IFs, which precisely estimate the influence of the target parameters while nullifying nuisance gradient changes in the fixed parameters. We identify the target parameters most closely associated with the input data using output-based and gradient-based parameter selection methods. We evaluate generalized IFs against several alternative IF formulations on class removal and label change tasks. The experiments align with the "less is more" philosophy: updating only 5% of the model parameters produces more accurate results than other influence functions across all tasks. We believe our proposal can serve as a foundational tool for optimizing models, conducting data analysis, and enhancing AI interpretability beyond the limitations of IFs. Code is available at https://github.com/hslyu/GIF.
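The abstract does not spell out the selection rule, so the following is only an illustrative sketch of what a gradient-based parameter selection step could look like: it keeps the fraction of parameters whose gradient on the examined data has the largest magnitude. The function name `select_top_fraction`, the use of absolute gradient magnitude as the ranking score, and the flat-list representation of gradients are all assumptions made for illustration, not the authors' implementation (see the linked repository for that).

```python
import math


def select_top_fraction(grads, fraction=0.05):
    """Return sorted indices of the parameters with the largest |gradient|.

    Hypothetical helper: `grads` is a flat list of per-parameter gradients
    computed on the examined data, and `fraction` is the share of parameters
    to treat as target (updatable) parameters.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one parameter, rounding the budget up.
    k = max(1, math.ceil(fraction * len(grads)))
    # Rank parameter indices by gradient magnitude, largest first.
    ranked = sorted(range(len(grads)), key=lambda i: abs(grads[i]), reverse=True)
    return sorted(ranked[:k])


# Usage: select half of four parameters; the rest would stay fixed.
target_idx = select_top_fraction([0.1, -3.0, 0.2, 2.0], fraction=0.5)
```

In this sketch the returned indices mark the target parameters, and every other parameter is held fixed; the influence estimate itself is not shown here.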