Because time-series data annotated with descriptive texts is scarce, training a model to generate descriptive texts for time-series data is challenging. In this study, we propose a method to systematically generate domain-independent descriptive texts from time-series data. We identify two distinct approaches for creating pairs of time-series data and descriptive texts: the forward approach and the backward approach. By implementing the novel backward approach, we create the Temporal Automated Captions for Observations (TACO) dataset. Experimental results demonstrate that a contrastive learning-based model trained on the TACO dataset is capable of generating descriptive texts for time-series data in novel domains.