How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce \textit{Pythia}, a suite of 16 LLMs, all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. We provide public access to 154 checkpoints for each of the 16 models, alongside tools to download and reconstruct their exact training dataloaders for further study. We intend \textit{Pythia} to facilitate research in many areas, and we present several case studies, including novel results in memorization, term frequency effects on few-shot performance, and reducing gender bias. We demonstrate that this highly controlled setup can be used to yield novel insights into LLMs and their training dynamics. Trained models, analysis code, training code, and training data can be found at https://github.com/EleutherAI/pythia.