Biosignals exhibit substantial cross-subject and cross-session variability, inducing severe domain shifts that degrade the post-deployment performance of small, edge-oriented AI models. On-device adaptation is therefore essential both to preserve user privacy and to ensure system reliability. However, existing sub-100 mW MCU-based wearable platforms can support only shallow or sparse adaptation schemes, because full backpropagation (BP) has a prohibitive memory footprint and computational cost. In this paper, we propose BioTrain, a framework that enables full-network fine-tuning of state-of-the-art biosignal models under milliwatt-scale power and sub-megabyte memory constraints. We validate BioTrain with both offline and on-device benchmarks on EEG and EOG datasets, covering Day-1 new-subject calibration and longitudinal adaptation to signal drift. Experimental results show that full-network fine-tuning improves accuracy by up to 35% over non-adapted baselines and outperforms last-layer updates by approximately 7% during new-subject calibration. On the GAP9 MCU platform, BioTrain achieves an on-device training throughput of 17 samples/s for EEG models and 85 samples/s for EOG models within a power envelope below 50 mW. In addition, BioTrain's memory allocator and network topology optimization enable large batch sizes while reducing peak memory usage. For fully on-chip BP on GAP9, BioTrain reduces the memory footprint by 8.1x, from 5.4 MB to 0.67 MB, compared to conventional full-network fine-tuning using batch normalization with batch size 8.
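To illustrate why full BP is more memory-hungry than a last-layer update, the following is a minimal back-of-the-envelope sketch, not BioTrain's allocator or its actual accounting. It assumes a simple cost model in which training memory is the sum of all weights, the gradients of the trainable layers, and the input activations saved for those layers across the batch. The function `training_memory_bytes`, the `first_trainable` parameter, and the layer shapes in `toy_layers` are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Each layer is (input activations per sample, parameter count).
# These shapes are hypothetical and are NOT taken from the BioTrain models.
Layer = Tuple[int, int]


def training_memory_bytes(
    layers: List[Layer],
    batch_size: int,
    first_trainable: int = 0,
    bytes_per_value: int = 4,
) -> int:
    """Rough training-memory estimate under a simplified cost model.

    Counts all weights, gradients for trainable layers, and the input
    activations that must be kept for the backward pass of those layers.
    Optimizer state and temporary buffers are ignored (an assumption).
    """
    weights = sum(params for _, params in layers)
    trainable = layers[first_trainable:]
    grads = sum(params for _, params in trainable)
    saved_activations = batch_size * sum(acts for acts, _ in trainable)
    return bytes_per_value * (weights + grads + saved_activations)


toy_layers = [(1000, 5000), (2000, 10000), (500, 1000)]

# Full-network fine-tuning: every layer is trainable.
full_network = training_memory_bytes(toy_layers, batch_size=8)

# Last-layer update: only the final layer is trainable.
last_layer = training_memory_bytes(toy_layers, batch_size=8, first_trainable=2)

print(full_network, last_layer)  # 240000 84000
```

Under this toy model the saved-activation term grows linearly with batch size, which is why batch size and peak memory are usually in tension for full-network fine-tuning.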