We present time vectors, a simple tool to customize language models to new time periods. Time vectors are created by finetuning a language model on data from a single time (e.g., a year or month), and then subtracting the weights of the original pretrained model. This vector specifies a direction in weight space that, as our experiments show, improves performance on text from that time period. Time vectors specialized to adjacent time periods appear to be positioned closer together in a manifold. Using this structure, we interpolate between time vectors to induce new models that perform better on intervening and future time periods, without any additional training. We demonstrate the consistency of our findings across different tasks, domains, model sizes, and time scales. Our results suggest that time is encoded in the weight space of finetuned models.
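The construction described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: model weights are represented as toy dictionaries of float lists (a real model would use framework tensors), the helper names `time_vector` and `interpolate` are hypothetical, the years 2012 and 2016 are made-up labels, and the interpolation is assumed to be a simple convex combination of two time vectors added back to the pretrained weights.

```python
from typing import Dict, List

# Toy stand-in for a model's weights: parameter name -> flat list of floats.
# (Assumption for illustration; real models would use framework tensors.)
Weights = Dict[str, List[float]]


def time_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Subtract pretrained weights from weights finetuned on one time period."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def interpolate(pretrained: Weights, tau_a: Weights, tau_b: Weights,
                alpha: float) -> Weights:
    """Build new weights from a convex combination of two time vectors.

    Assumed form: pretrained + alpha * tau_a + (1 - alpha) * tau_b.
    No training is involved; this is pure weight arithmetic.
    """
    return {
        name: [
            p + alpha * a + (1.0 - alpha) * b
            for p, a, b in zip(pretrained[name], tau_a[name], tau_b[name])
        ]
        for name in pretrained
    }


# Hypothetical toy weights for a pretrained model and two finetuned copies.
pretrained = {"w": [1.0, 2.0]}
finetuned_2012 = {"w": [1.5, 2.0]}
finetuned_2016 = {"w": [1.0, 3.0]}

tau_2012 = time_vector(finetuned_2012, pretrained)
tau_2016 = time_vector(finetuned_2016, pretrained)

# Halfway between the two periods, as a candidate model for intervening years.
midpoint = interpolate(pretrained, tau_2012, tau_2016, alpha=0.5)
```

Setting `alpha=1.0` recovers the weights finetuned on the first period, and `alpha=0.0` recovers the second; intermediate values trace a straight line between the two in weight space.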