WEASEL 2.0 -- A Random Dilated Dictionary Transform for Fast, Accurate and Memory Constrained Time Series Classification

A time series is a sequence of real values ordered in time. Time series classification (TSC) is the task of assigning a time series to one of a set of predefined classes, usually based on a model learned from examples. Dictionary-based methods for TSC rely on counting the frequency of certain patterns in time series and are important components of the currently most accurate TSC ensembles. One of the early dictionary-based methods was WEASEL, which at the time achieved state-of-the-art (SotA) results while also being very fast. However, it has since been outperformed in terms of both speed and accuracy by other methods. Furthermore, its design leads to an unpredictably large memory footprint, making it unsuitable for many applications. In this paper, we present WEASEL 2.0, a complete overhaul of WEASEL based on two recent advancements in TSC: dilation and ensembling of randomized hyper-parameter settings. These two techniques allow WEASEL 2.0 to work with a fixed-size memory footprint while at the same time improving accuracy. Compared to 15 other SotA methods on the UCR benchmark set, WEASEL 2.0 is significantly more accurate than other dictionary methods and not significantly worse than the currently best methods. In fact, it achieves the highest median accuracy over all data sets, and it performs best in 5 out of 12 problem classes. We thus believe that WEASEL 2.0 is a viable alternative for current TSC and also a potentially interesting input for future ensembles.
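To make the two ideas named in the abstract more concrete, the following is a minimal Python sketch of pattern counting over dilated windows with randomly drawn settings. It is an illustration under stated assumptions, not the WEASEL 2.0 implementation: the function names (`dilated_windows`, `to_word`, `bag_of_patterns`, `random_dilated_features`) are hypothetical, the discretization is a toy mean-threshold rule swapped in for whatever symbolic transform the method actually uses, and the sketch does not reproduce the fixed-size memory footprint described above.

```python
from collections import Counter
import random


def dilated_windows(series, window, dilation):
    """Yield subsequences of `window` values taken every `dilation` steps."""
    span = (window - 1) * dilation + 1  # stretch of the series one window covers
    for start in range(len(series) - span + 1):
        yield series[start:start + span:dilation]


def to_word(values):
    """Toy discretization (an assumption, not the paper's transform):
    'b' if a value lies above the window mean, otherwise 'a'."""
    mean = sum(values) / len(values)
    return "".join("b" if v > mean else "a" for v in values)


def bag_of_patterns(series, window, dilation):
    """Count how often each word occurs across all dilated windows."""
    return Counter(to_word(w) for w in dilated_windows(series, window, dilation))


def random_dilated_features(series, n_configs=3, max_window=4, seed=0):
    """Pool word counts from several randomly drawn (window, dilation) settings."""
    rng = random.Random(seed)
    features = Counter()
    for _ in range(n_configs):
        window = rng.randint(2, max_window)
        # Largest dilation for which at least one window still fits.
        max_dilation = max(1, (len(series) - 1) // (window - 1))
        dilation = rng.randint(1, max_dilation)
        counts = bag_of_patterns(series, window, dilation)
        for word, count in counts.items():
            features[(window, dilation, word)] += count
    return features


series = [0, 1, 0, 1, 0, 1, 0, 1]
print(bag_of_patterns(series, window=2, dilation=1))
print(bag_of_patterns(series, window=2, dilation=2))
print(random_dilated_features(series))
```

On the alternating toy series, a dilation of 1 sees only rising and falling pairs, whereas a dilation of 2 compares every value with the one two steps later and therefore sees only flat pairs. This is the intuition behind dilation: the same window length captures structure at a different time scale. The resulting count vectors would then be passed to a classifier, which is outside the scope of this sketch.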
