Reliable uncertainty estimation is a key missing piece for on-device monitoring in TinyML. Microcontrollers must detect failures, distribution shift, or accuracy drops under strict flash and latency budgets, yet common uncertainty approaches (deep ensembles, MC dropout, early exits, temporal buffering) typically require multiple passes, extra branches, or state that is impractical on milliwatt hardware. This paper proposes SNAP-UQ, a practical method for single-pass, label-free uncertainty estimation based on depth-wise next-activation prediction. SNAP-UQ taps a small set of backbone layers and uses tiny int8 heads to predict the mean and scale of each tapped layer's next activation from a low-rank projection of the previous one. The resulting standardized prediction error forms a depth-wise surprisal signal, which is aggregated and mapped through a lightweight monotone calibrator into an actionable uncertainty score. The design introduces no temporal buffers or auxiliary exits and preserves state-free inference, while increasing deployment footprint by only a few tens of kilobytes. Across vision and audio backbones, SNAP-UQ reduces flash and latency relative to early-exit and deep-ensemble baselines (typically $\sim$40--60% smaller and $\sim$25--35% faster), while several competing methods at similar accuracy often exceed MCU memory limits. On corrupted streams, it improves accuracy-drop event detection by multiple AUPRC points and maintains strong failure detection (AUROC $\approx 0.9$) in a single forward pass. By grounding uncertainty in layer-to-layer dynamics rather than solely in output confidence, SNAP-UQ offers a novel, resource-efficient basis for robust TinyML monitoring. Our code is available at: https://github.com/Ism-ail11/SNAP-UQ
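The pipeline described in the abstract (low-rank projection, predicted mean and scale, standardized error, aggregation, monotone calibration) can be illustrated with a minimal NumPy sketch. This is not the released SNAP-UQ code: the function names, the Gaussian-style surprisal, the weighted-sum aggregation, and the logistic calibrator are all assumptions made for illustration, and the arithmetic is float rather than int8.

```python
import numpy as np

# Hypothetical names throughout; a float sketch, not the paper's int8 implementation.

def layer_surprisal(a_prev, a_next, P, W_mu, W_logsig):
    """Surprisal of a_next given a_prev at one tapped layer.

    Assumes a diagonal Gaussian form: mean of 0.5*e^2 + log(sigma),
    where e is the standardized prediction error.
    """
    z = P @ a_prev                            # low-rank projection, shape (r,)
    mu = W_mu @ z                             # predicted mean of next activation
    log_sigma = W_logsig @ z                  # predicted log-scale
    err = (a_next - mu) / np.exp(log_sigma)   # standardized prediction error
    return float(np.mean(0.5 * err**2 + log_sigma))

def uncertainty_score(surprisals, weights, slope=1.0, bias=0.0):
    """Aggregate per-layer surprisals, then apply a monotone calibrator.

    The logistic map is monotone increasing for slope > 0 (an assumed choice).
    """
    s = float(np.dot(weights, surprisals))    # depth-wise weighted aggregation
    return 1.0 / (1.0 + np.exp(-(slope * s + bias)))

# Toy demonstration with random heads and a single tapped layer.
rng = np.random.default_rng(0)
d_prev, d_next, r = 32, 16, 4
P = rng.normal(size=(r, d_prev)) / np.sqrt(d_prev)
W_mu = rng.normal(size=(d_next, r))
W_logsig = np.zeros((d_next, r))              # unit predicted scale for simplicity

a_prev = rng.normal(size=d_prev)
mu = W_mu @ (P @ a_prev)
noise = rng.normal(size=d_next)

# A next activation close to the prediction vs. one far from it.
s_typical = layer_surprisal(a_prev, mu + 0.1 * noise, P, W_mu, W_logsig)
s_shifted = layer_surprisal(a_prev, mu + 3.0 * noise, P, W_mu, W_logsig)

u_typical = uncertainty_score([s_typical], [1.0])
u_shifted = uncertainty_score([s_shifted], [1.0])
print(f"typical: {u_typical:.3f}  shifted: {u_shifted:.3f}")
```

In this toy setup the activation that deviates more from the predicted mean receives the higher score, and everything is computed from one forward pass with no stored state between inputs.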