Biomedical signal classification is challenging because the signals are long, have complex temporal dynamics, and contain multi-scale frequency patterns that standard transformer architectures capture poorly. We propose WaveFormer, a transformer architecture that integrates wavelet decomposition at two stages. In embedding construction, a multi-channel Discrete Wavelet Transform (DWT) extracts frequency features so that each token carries both time-domain and frequency-domain information. In positional encoding, Dynamic Wavelet Positional Encoding (DyWPE) adapts the position embeddings to signal-specific temporal structure through a mono-channel DWT analysis. We evaluate WaveFormer on eight diverse datasets spanning human activity recognition and brain signal analysis, with sequence lengths ranging from 50 to 3000 timesteps and channel counts from 1 to 144. The experiments show that WaveFormer achieves competitive performance through frequency-aware processing at both stages. Our approach provides a principled framework for incorporating frequency-domain knowledge into transformer-based time series classification.
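To make the embedding idea concrete, the following is a minimal sketch of how a token could combine raw samples with wavelet coefficients. It is not the WaveFormer implementation: the Haar wavelet, the non-overlapping patching scheme, and the names `haar_dwt`, `haar_wavedec`, `make_tokens`, `patch_len`, and `levels` are all assumptions introduced for illustration, since the abstract does not specify the wavelet family or the tokenization.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One level of the orthonormal Haar DWT. len(x) must be even."""
    if len(x) % 2 != 0:
        raise ValueError("signal length must be even")
    # Pairwise scaled sums give the low-frequency (approximation) band,
    # pairwise scaled differences give the high-frequency (detail) band.
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_wavedec(x, levels):
    """Multi-level Haar decomposition, flattened as
    [approx_L, detail_L, ..., detail_1] (coarsest scale first)."""
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details = detail + details  # coarser details go in front
    return approx + details


def make_tokens(signal, patch_len, levels):
    """Hypothetical tokenizer (an assumption, not the paper's scheme).

    signal: list of channels, each a list of floats of equal length.
    Returns one token per non-overlapping patch. For every channel the
    token holds the raw patch samples followed by the patch's Haar
    coefficients, so it carries time- and frequency-domain information.
    """
    n = len(signal[0])
    tokens = []
    for start in range(0, n - patch_len + 1, patch_len):
        token = []
        for channel in signal:
            patch = channel[start:start + patch_len]
            token.extend(patch)                        # time-domain part
            token.extend(haar_wavedec(patch, levels))  # frequency-domain part
        tokens.append(token)
    return tokens


if __name__ == "__main__":
    # Two channels, eight timesteps each.
    demo = [
        [1.0, 1.0, 1.0, 1.0, 0.0, 2.0, 0.0, 2.0],
        [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
    ]
    tokens = make_tokens(demo, patch_len=4, levels=2)
    print(len(tokens), len(tokens[0]))
```

In this sketch `patch_len` must be divisible by `2 ** levels`, otherwise `haar_dwt` raises a `ValueError`. Each token has `channels * 2 * patch_len` entries because an orthonormal DWT returns as many coefficients as input samples. A real model would typically project these tokens through a learned linear layer before the transformer encoder.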