We present time vectors, a simple tool to customize language models to new time periods. Time vectors are created by finetuning a language model on data from a single time (e.g., a year or month), and then subtracting the weights of the original pretrained model. This vector specifies a direction in weight space that, as our experiments show, improves performance on text from that time period. Time vectors specialized to adjacent time periods appear to be positioned closer together in a manifold. Using this structure, we interpolate between time vectors to induce new models that perform better on intervening and future time periods, without any additional training. We demonstrate the consistency of our findings across different tasks, domains, model sizes, and time scales. Our results suggest that time is encoded in the weight space of finetuned models.
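The weight arithmetic described above can be illustrated with a minimal sketch. It assumes model weights are stored as plain dictionaries mapping parameter names to lists of floats. The function names (`time_vector`, `interpolate`, `apply_vector`), the toy weights, and the year labels are hypothetical and chosen for illustration only. This is not the authors' implementation, and the mixing coefficient `alpha` is an assumed parameterization of the interpolation.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def time_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Subtract the pretrained weights from the finetuned weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def interpolate(tau_a: Weights, tau_b: Weights, alpha: float) -> Weights:
    """Linearly mix two time vectors: alpha * tau_a + (1 - alpha) * tau_b."""
    return {
        name: [alpha * a + (1.0 - alpha) * b for a, b in zip(tau_a[name], tau_b[name])]
        for name in tau_a
    }


def apply_vector(pretrained: Weights, tau: Weights) -> Weights:
    """Add a time vector back onto the pretrained weights to get a new model."""
    return {
        name: [p + t for p, t in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Hypothetical toy checkpoints (illustrative values, not real model weights).
pretrained = {"w": [1.0, 2.0]}
finetuned_2012 = {"w": [1.5, 2.0]}
finetuned_2016 = {"w": [1.0, 3.0]}

tau_2012 = time_vector(finetuned_2012, pretrained)
tau_2016 = time_vector(finetuned_2016, pretrained)

# Interpolate halfway between the two years to target an intervening period,
# without any further training.
tau_mid = interpolate(tau_2012, tau_2016, alpha=0.5)
model_mid = apply_vector(pretrained, tau_mid)
```

With real checkpoints the same arithmetic would be applied tensor by tensor across the model's parameters. Which `alpha` works best for a given target period is an empirical question that this sketch does not address.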