Large language models (LLMs) suffer from temporal misalignment, especially across long spans of time. The issue arises because LLMs are trained on large corpora in which temporal information spanning long periods, such as thousands of years, is sparse, leading to insufficient learning or catastrophic forgetting. This paper proposes a methodology named "Ticktack" to address LLMs' long-time-span misalignment at a yearly granularity. Specifically, we first propose replacing the Gregorian year expression used by LLMs with the sexagenary year expression, which yields a more uniform distribution at yearly granularity. We then employ polar coordinates to model the sexagenary cycle of 60 terms and the year order within each term, with additional temporal encoding to ensure that LLMs understand them. Finally, we present a temporal representational alignment approach for post-training LLMs that effectively distinguishes time points with relevant knowledge, thereby improving performance on time-related tasks, particularly over long spans. We also create a long-time-span benchmark for evaluation. Experimental results demonstrate the effectiveness of our proposal.
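As a minimal sketch of the polar-coordinate view of the sexagenary year expression described above: the angle can encode which of the 60 terms a year falls on, and the radius can encode the year order (i.e., which 60-year cycle it belongs to). The anchor year 1984 (a cycle-start, or jiazi, year) and the function name below are illustrative assumptions, not details from the paper.

```python
import math


def year_to_polar(year: int, anchor: int = 1984, cycle: int = 60):
    """Map a Gregorian year to illustrative polar coordinates (r, theta).

    Assumes `anchor` is the first year of a sexagenary cycle.
    theta encodes the term within the 60-term cycle; r encodes the
    cycle order (how many full cycles separate the year from the anchor).
    """
    offset = year - anchor
    term = offset % cycle          # which of the 60 sexagenary terms
    order = offset // cycle        # which 60-year cycle the year falls in
    theta = 2 * math.pi * term / cycle
    return order, theta
```

Because the angle wraps every 60 years, years sharing the same term land at the same angular position, while the radius separates different cycles; this is one way to make the cyclic structure explicit for a downstream temporal encoding.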