As the capabilities of Large Language Models (LLMs) grow, there has been an increasing push to curate high-quality datasets by filtering samples in the training data. In general, Data Attribution (DA) methods aim to estimate how individual samples in a training dataset precondition a model to generate certain outputs. For example, one might want to know which training samples could be the source of toxic behavior in the trained LLM. Many methods quantify this conditioning through the paradigm of influence functions. While methods of this family are effective, they lack the processing speed and storage compactness needed for practical use on large datasets. We propose Influcoder, a fast and cost-effective approach to influence-based Data Attribution at scale.