Sourced from various sensors and organized chronologically, Multivariate Time-Series (MTS) data involves crucial spatial-temporal dependencies, e.g., correlations among sensors. To capture these dependencies, Graph Neural Networks (GNNs) have emerged as powerful tools, yet their effectiveness is restricted by the quality of the graph constructed from MTS data. Typically, existing approaches construct graphs solely from MTS signals, which may introduce bias due to small training datasets and may not accurately represent the underlying dependencies. To address this challenge, we propose a novel framework named K-Link, which leverages Large Language Models (LLMs) that encode extensive general knowledge, thereby offering an effective way to reduce this bias. Drawing on the knowledge embedded in LLMs, such as physical principles, we extract a \textit{Knowledge-Link graph} that captures rich sensor-level semantic knowledge and the links among that knowledge. To harness the potential of the knowledge-link graph for enhancing the graph derived from MTS data, we propose a graph alignment module that transfers semantic knowledge from the knowledge-link graph into the MTS-derived graph. In this way, we improve graph quality and ensure effective representation learning with GNNs for MTS data. Extensive experiments demonstrate the efficacy of our approach, which achieves superior performance across various MTS-related downstream tasks.
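The core idea above can be illustrated with a minimal sketch: build two similarity graphs over the same set of sensors, one from (hypothetical) LLM-derived semantic embeddings of sensor descriptions and one from features learned on the MTS signals, then penalize their disagreement so that knowledge-link structure guides the MTS-derived graph. All names and shapes here are illustrative assumptions, not the paper's actual implementation; random vectors stand in for real LLM and signal encoders.

```python
import numpy as np

def cosine_sim_graph(emb):
    # Edge weight = cosine similarity between node (sensor) embeddings.
    norm = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return norm @ norm.T

rng = np.random.default_rng(0)

# Hypothetical embeddings: in K-Link the first set would come from an LLM's
# text encoder applied to sensor-level descriptions (semantic knowledge),
# the second from an encoder over the MTS signals themselves.
knowledge_emb = rng.normal(size=(4, 8))   # 4 sensors, 8-dim semantic features
mts_emb = rng.normal(size=(4, 8))         # 4 sensors, 8-dim signal features

A_know = cosine_sim_graph(knowledge_emb)  # knowledge-link graph (adjacency)
A_mts = cosine_sim_graph(mts_emb)         # MTS-derived graph (adjacency)

# A simple alignment objective: mean squared disagreement between the two
# graphs. Minimizing this w.r.t. the MTS encoder would pull the MTS-derived
# graph toward the knowledge-link structure.
align_loss = np.mean((A_know - A_mts) ** 2)
```

In a real training loop, `align_loss` would be one term in the overall objective, added to the downstream task loss so the GNN still fits the MTS data while its graph inherits semantic structure.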