Training data attribution (TDA) methods aim to attribute model outputs back to specific training examples, and the application of these methods to large language model (LLM) outputs could significantly advance model transparency and data curation. However, it has been challenging to date to apply these methods to the full scale of LLM pretraining. In this paper, we refine existing gradient-based methods to work effectively at scale, allowing us to retrieve influential examples for an 8B-parameter language model from a pretraining corpus of over 160B tokens with no need for subsampling or pre-filtering. Our method combines several techniques, including optimizer state correction, a task-specific Hessian approximation, and normalized encodings, which we find to be critical for performance at scale. In quantitative evaluations on a fact tracing task, our method performs best at identifying examples that influence model predictions, but classical, model-agnostic retrieval methods such as BM25 still perform better at finding passages which explicitly contain relevant facts. These results demonstrate a misalignment between factual *attribution* and causal *influence*. With increasing model size and training tokens, we find that influence more closely aligns with factual attribution. Finally, we examine different types of examples identified as influential by our method, finding that while many directly entail a particular fact, others support the same output by reinforcing priors on relation types, common entities, and names. We release our prompt set and model outputs, along with a web-based visualization tool to explore influential examples for factual predictions, commonsense reasoning, arithmetic, and open-ended generation for an 8B-parameter LLM.
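As a rough illustration (not the paper's implementation), gradient-based influence retrieval of the kind described above can be sketched as a similarity search over gradient encodings of the query and the training examples. All names below are hypothetical, and the optimizer-state correction and Hessian approximation the paper relies on are omitted; only the normalization of encodings is shown.

```python
import numpy as np

def influence_scores(query_grad, train_grads, normalize=True):
    """Score training examples by gradient similarity to a query.

    Minimal TDA sketch: a training example's influence is approximated
    by the dot product between its gradient encoding and the query's
    gradient encoding. The paper's optimizer-state and task-specific
    Hessian corrections are deliberately left out of this toy version.
    """
    q = np.asarray(query_grad, dtype=np.float64)
    G = np.asarray(train_grads, dtype=np.float64)
    if normalize:  # unit-normalize encodings (the paper finds this critical at scale)
        q = q / np.linalg.norm(q)
        G = G / np.linalg.norm(G, axis=1, keepdims=True)
    return G @ q  # one influence score per training example

# toy usage: rank training examples by influence on the query
scores = influence_scores([1.0, 0.0], [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
top_indices = np.argsort(scores)[::-1]  # most influential first
```

In practice the encodings would be low-dimensional projections of per-example gradients rather than raw gradients, since storing full gradients for 160B tokens is infeasible.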