This work summarizes two strategies for completing time-series (TS) tasks with today's large language models (LLMs): LLM-for-TS, which designs and trains a large foundation model for TS data, and TS-for-LLM, which enables a pre-trained LLM to handle TS data. Given insufficient data accumulation, limited resources, and the need for semantic context, this work focuses on TS-for-LLM methods and aims to activate an LLM's ability to handle TS data by designing a TS embedding method suited to LLMs. The proposed method, named TEST, first tokenizes the TS, then builds an encoder that embeds the tokens through instance-wise, feature-wise, and text-prototype-aligned contrast, then creates prompts that make the LLM more receptive to the embeddings, and finally carries out the TS tasks. Experiments cover TS classification and forecasting tasks using 8 LLMs of different structures and sizes. Although the results do not significantly outperform current SOTA models customized for TS tasks, treating the LLM as a pattern machine endows it with the ability to process TS data without compromising its language ability. This paper is intended to serve as a foundational work that will inspire further research.
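The pipeline described in the abstract (tokenize the series, then train an encoder with instance-wise, feature-wise, and text-prototype-aligned contrast) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes fixed-length non-overlapping patches as tokens, uses the raw patches in place of a learned encoder's output, uses an InfoNCE-style loss for both contrast terms, and stands in random vectors for text prototypes that would in practice come from an LLM's token embeddings. All function names (`tokenize`, `info_nce`, `prototype_alignment`, `jitter`) are hypothetical.

```python
import math
import random


def tokenize(series, patch_len):
    """Split a 1-D series into non-overlapping fixed-length patches (tokens)."""
    n = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss: anchors[i] should match positives[i], not positives[j != i]."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cosine(a, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def transpose(rows):
    """Turn a list of embeddings into a list of feature columns."""
    return [list(col) for col in zip(*rows)]


def prototype_alignment(embeddings, prototypes):
    """Mean (1 - cosine) distance from each embedding to its nearest prototype."""
    dists = [1.0 - max(cosine(e, p) for p in prototypes) for e in embeddings]
    return sum(dists) / len(dists)


def jitter(rows, scale, rng):
    """Create an augmented view by adding small Gaussian noise (an assumed augmentation)."""
    return [[x + rng.gauss(0.0, scale) for x in row] for row in rows]


rng = random.Random(0)
series = [math.sin(0.3 * t) for t in range(64)]
tokens = tokenize(series, patch_len=8)   # stand-in for encoder outputs
view = jitter(tokens, 0.01, rng)         # second view of the same tokens

# Hypothetical text prototypes; random here, LLM token embeddings in practice.
prototypes = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]

instance_loss = info_nce(tokens, view)                       # rows as instances
feature_loss = info_nce(transpose(tokens), transpose(view))  # columns as instances
align_loss = prototype_alignment(tokens, prototypes)
total_loss = instance_loss + feature_loss + align_loss
print(instance_loss, feature_loss, align_loss, total_loss)
```

In this sketch the three terms are simply summed with equal weight. How the terms are weighted, which augmentations are used, and how prototypes are selected are design choices the abstract does not specify.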