Data Augmentation (DA) has become an indispensable strategy in Time Series Classification (TSC) because it enlarges the training set, which improves model robustness, diversifies datasets, and reduces overfitting. However, research on DA for TSC suffers from fragmented literature reviews, unclear methodological taxonomies, inadequate evaluation measures, and a lack of accessible, user-oriented tools. To address these problems, this study provides an in-depth examination of DA methods for TSC. We first conducted a literature review covering a decade of work, which showed that existing surveys capture little of the breadth of advances in DA for TSC. We therefore analyzed over 100 scholarly articles and distilled more than 60 unique DA techniques. From this analysis we developed a novel taxonomy tailored to DA in TSC that groups techniques into five principal categories: Transformation-Based, Pattern-Based, Generative, Decomposition-Based, and Automated Data Augmentation. The taxonomy is intended to guide researchers in method selection. To address the absence of holistic evaluations of prevalent DA techniques, we carried out a comprehensive empirical assessment in which more than 15 DA strategies were evaluated on 8 UCR time-series datasets, using ResNet and a multi-faceted evaluation scheme comprising Accuracy, Method Ranking, and Residual Analysis, yielding a benchmark accuracy of 88.94 ± 11.83%. Our investigation underscored the inconsistent efficacy of DA techniques, with...
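As a rough illustration of what the Transformation-Based category covers, the sketch below applies two commonly used transformations, jittering (additive Gaussian noise) and magnitude scaling, to enlarge a toy training set. The abstract does not name these two techniques or describe their implementation, so the function names (`jitter`, `scale`, `augment`), the default `sigma` values, and the synthetic series are all assumptions made for this example, not the study's method.

```python
import math
import random


def jitter(series, sigma=0.03, rng=None):
    """Add zero-mean Gaussian noise to every time step (sigma is an assumed default)."""
    if rng is None:
        rng = random.Random()
    return [x + rng.gauss(0.0, sigma) for x in series]


def scale(series, sigma=0.1, rng=None):
    """Multiply the whole series by one random factor drawn around 1.0."""
    if rng is None:
        rng = random.Random()
    factor = rng.gauss(1.0, sigma)
    return [x * factor for x in series]


def augment(dataset, labels, transforms, copies=1, seed=0):
    """Return the originals plus `copies` augmented versions per transform.

    Labels are carried over unchanged, which assumes every transform is
    label-preserving for the task at hand.
    """
    rng = random.Random(seed)
    out_x, out_y = list(dataset), list(labels)
    for series, label in zip(dataset, labels):
        for transform in transforms:
            for _ in range(copies):
                out_x.append(transform(series, rng=rng))
                out_y.append(label)
    return out_x, out_y


# Toy data: two synthetic series, not taken from any UCR dataset.
sine = [math.sin(2 * math.pi * t / 50) for t in range(100)]
ramp = [t / 100 for t in range(100)]

X, y = augment([sine, ramp], [0, 1], [jitter, scale], copies=2)
print(len(X), len(y))  # 2 originals + 2 series * 2 transforms * 2 copies = 10 each
```

Whether such transformations actually help depends on the dataset, which is consistent with the inconsistent efficacy the abstract reports. In practice the noise level and scaling spread would need tuning per dataset.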