Transfer learning is a common practice that alleviates the need for extensive data when training neural networks: a model is pre-trained on a source dataset and then fine-tuned for a target task. However, not every source dataset is appropriate for a given target dataset, especially for time series. In this paper, we propose a novel method for selecting and using multiple datasets in transfer learning for time series classification. Specifically, our method combines multiple datasets into a single source dataset for pre-training neural networks. Furthermore, to select these sources effectively, our method measures the transferability of datasets based on shapelet discovery. Whereas traditional transferability measures require considerable time to pre-train on every candidate source for each candidate architecture, our measure is obtained with a single, simple computation and can be reused for every possible architecture. Using the proposed method, we demonstrate that the performance of temporal convolutional neural networks (CNNs) on time series datasets can be improved.