Large language models (LLMs) have demonstrated their effectiveness in multivariate time series classification (MTSC). Effective adaptation of LLMs for MTSC requires informative data representations. Existing LLM-based methods encode time-series embeddings from scratch within the latent space of an LLM, attempting to align them with the LLM's semantic space. Despite their effectiveness, we reveal that these methods suffer from three inherent bottlenecks: (1) they struggle to encode temporal and channel-specific information in a lossless manner, both of which are critical components of multivariate time series; (2) aligning the learned representation space with the semantic space of the LLM is difficult; (3) they require task-specific retraining, which is both computationally expensive and labor-intensive. To bridge these gaps, we propose TableTime, which reformulates MTSC as a table understanding task. Specifically, TableTime introduces the following strategies: (1) converting multivariate time series into a tabular form, thereby minimizing information loss; (2) representing the tabular time series in text format to achieve natural alignment with the semantic space of LLMs; (3) designing a reasoning framework that integrates contextual text information, neighborhood assistance, multi-path inference, and problem decomposition to enhance the reasoning ability of LLMs and realize zero-shot classification. Extensive experiments on 10 representative public datasets from the UEA archive verify the superiority of TableTime.
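The tabular conversion in strategy (1) can be sketched as follows. This is a minimal illustration, not the paper's exact prompt format: the function name, CSV-style layout, and default channel names are assumptions made for clarity.

```python
import numpy as np

def series_to_table_text(series: np.ndarray, channel_names=None) -> str:
    """Render a multivariate time series (timesteps x channels) as a
    plain-text table, keeping every timestamp and channel value explicit
    so no temporal or channel-specific information is discarded."""
    n_steps, n_channels = series.shape
    if channel_names is None:
        # Hypothetical default names; a real dataset would supply its own.
        channel_names = [f"ch{i}" for i in range(n_channels)]
    header = "time," + ",".join(channel_names)
    rows = [header]
    for t in range(n_steps):
        values = ",".join(f"{v:.3f}" for v in series[t])
        rows.append(f"{t},{values}")
    return "\n".join(rows)
```

Because the output is plain text, it can be placed directly into an LLM prompt, avoiding any learned embedding step and the alignment problem that comes with it.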