Time series classification is an important analytical task across diverse domains. However, its practical application is often hindered by the scarcity of labeled data and the requirement for substantial computational resources. To address these challenges, this paper proposes EvoTSC, a novel genetic programming approach designed to automatically evolve lightweight feature learning models for time series classification. The core of EvoTSC is a carefully designed multi-layer program structure that strategically embeds diverse forms of prior expert knowledge into the evolutionary process, effectively guiding the search toward operations known to be highly effective for time series analysis. To mitigate the common overfitting problem in time series classification, a tailored Pareto tournament selection strategy is proposed to favor models that perform consistently well across varying training data subsets, promoting the discovery of highly generalizable models. Extensive experiments conducted on univariate time series classification datasets demonstrate that EvoTSC significantly outperforms eleven benchmark methods in most comparisons. Further analyses verify the contribution of each component and the resource efficiency of the evolved models.
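The Pareto tournament selection idea described above can be illustrated with a minimal sketch. The abstract does not give the exact objectives or tie-breaking rule used in EvoTSC, so this sketch assumes each candidate model is scored by its accuracy on several training data subsets, and that a tournament winner is drawn uniformly from the non-dominated contestants. The names `dominates`, `pareto_front`, and `pareto_tournament` are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every subset and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]], idx: List[int]) -> List[int]:
    """Return the indices in idx whose score vectors no other contestant dominates."""
    return [
        i for i in idx
        if not any(dominates(scores[j], scores[i]) for j in idx if j != i)
    ]


def pareto_tournament(scores: List[Sequence[float]],
                      tournament_size: int,
                      rng: random.Random) -> int:
    """Sample contestants without replacement and pick one non-dominated contestant.

    Assumption: the winner is chosen uniformly at random from the front;
    the paper may use a different rule.
    """
    contestants = rng.sample(range(len(scores)), tournament_size)
    front = pareto_front(scores, contestants)
    return rng.choice(front)


# Illustrative (made-up) per-subset accuracies for four candidate models.
scores = [
    [0.90, 0.90],  # consistently strong on both subsets
    [0.80, 0.70],  # dominated by model 0
    [0.95, 0.50],  # best on subset 1, weak on subset 2
    [0.60, 0.60],  # dominated by model 0
]
rng = random.Random(0)
winner = pareto_tournament(scores, tournament_size=4, rng=rng)  # index 0 or 2
```

Treating each subset as a separate objective means that a model which is poor on every subset can never win a tournament against one that is uniformly better. A model that excels on only one subset can still survive as a trade-off solution, so any additional preference for consistency would need an explicit tie-breaking rule.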