Quantifying the impact of training data points is crucial for understanding the outputs of machine learning models and for improving the transparency of the AI pipeline. The influence function is a principled and popular data attribution method, but its computational cost often makes it challenging to use. This issue is more pronounced for large language models and text-to-image models. In this work, we propose DataInf, an efficient influence approximation method that is practical for large-scale generative AI models. Leveraging an easy-to-compute closed-form expression, DataInf outperforms existing influence computation algorithms in both computational and memory efficiency. Our theoretical analysis shows that DataInf is particularly well-suited for parameter-efficient fine-tuning techniques such as LoRA. Through systematic empirical evaluations, we show that DataInf accurately approximates influence scores and is orders of magnitude faster than existing methods. In applications to RoBERTa-large, Llama-2-13B-chat, and stable-diffusion-v1.5 models, DataInf identifies the most influential fine-tuning examples better than other approximate influence scores. Moreover, it can help identify mislabeled data points.
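The abstract does not spell out the closed-form expression, so the following is only a minimal sketch of one way such an expression can arise. It assumes a damped Gauss-Newton-style Hessian built from per-example gradients, and it assumes the approximation swaps the order of averaging and matrix inversion so that each rank-one term can be inverted with the Sherman-Morrison formula. The function names, the single parameter block, and the damping value `lam` are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def sherman_morrison_inverse_apply(g, v, lam):
    """Return (g g^T + lam * I)^{-1} v without forming the d x d matrix.

    Uses the Sherman-Morrison identity:
    (g g^T + lam I)^{-1} = (1 / lam) * (I - g g^T / (lam + g^T g)).
    """
    return (v - g * (g @ v) / (lam + g @ g)) / lam


def approx_influence(train_grads, val_grad, lam):
    """Hypothetical sketch of influence scores under the swapped-order assumption.

    train_grads: array of shape (n, d), one gradient per training example.
    val_grad:    array of shape (d,), gradient of the validation loss.
    lam:         positive damping constant (an assumed hyperparameter).
    """
    # Approximate H^{-1} v by averaging the per-example inverses applied to v,
    # instead of inverting the averaged matrix.
    h_inv_v = np.mean(
        [sherman_morrison_inverse_apply(g, val_grad, lam) for g in train_grads],
        axis=0,
    )
    # Influence of training example k: -grad_k^T (H^{-1} v).
    return -train_grads @ h_inv_v
```

Under these assumptions, each training example costs a few dot products of length d, and no d x d matrix is ever stored or inverted. That would be consistent with the efficiency claims in the abstract.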