Transfer learning is a common practice that alleviates the need for extensive data when training neural networks: a model is pre-trained on a source dataset and then fine-tuned for a target task. However, not every source dataset is appropriate for a given target dataset, especially for time series. In this paper, we propose a novel method for selecting and using multiple datasets in transfer learning for time series classification. Specifically, our method combines multiple datasets into a single source dataset for pre-training neural networks. Furthermore, to select these sources effectively, our method measures the transferability of datasets based on shapelet discovery. Whereas traditional transferability measures require considerable time to pre-train on every candidate source for each candidate architecture, our measure is obtained with a single, simple computation and can be reused for every possible architecture. Using the proposed method, we demonstrate that the performance of temporal convolutional neural networks (CNNs) on time series datasets can be improved.