We present a highly compact run-time monitoring approach for deep computer vision networks that extracts selected knowledge from only a few hidden layers (as few as two), yet efficiently detects silent data corruption originating from both hardware memory faults and input faults. Building on the insight that critical faults typically manifest as peak or bulk shifts in the activation distribution of the affected network layers, we use strategically placed quantile markers to accurately estimate how anomalous the current inference is as a whole. Importantly, the detector component itself is kept algorithmically transparent, so that the categorization of regular and abnormal behavior remains interpretable to a human. Our technique achieves up to ~96% precision and ~98% recall in detection. Compared to state-of-the-art anomaly detection techniques, this approach requires minimal compute overhead (as little as 0.3% relative to inference time without the monitor) and contributes to the explainability of the model.
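To make the idea of quantile markers concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes five marker positions (minimum, quartiles, maximum), two monitored layers, and a simple per-marker bounds check as a stand-in for the transparent detector; the names `QUANTILES`, `quantile_markers`, `fit_bounds`, `is_anomalous`, and the `margin` parameter are all hypothetical.

```python
import numpy as np

# Assumed marker positions: minimum, quartiles, maximum.
QUANTILES = (0.0, 0.25, 0.5, 0.75, 1.0)


def quantile_markers(layer_activations):
    """Summarize each monitored layer by a few quantiles of its activations
    and concatenate them into one small feature vector."""
    return np.concatenate([
        np.quantile(np.asarray(a, dtype=np.float64).ravel(), QUANTILES)
        for a in layer_activations
    ])


def fit_bounds(calibration_features, margin=1.5):
    """Derive per-marker bounds from fault-free runs (hypothetical rule:
    observed range widened by `margin` times its span)."""
    feats = np.asarray(calibration_features)
    lo, hi = feats.min(axis=0), feats.max(axis=0)
    span = hi - lo
    return lo - margin * span, hi + margin * span


def is_anomalous(features, bounds):
    """Flag the inference if any marker leaves its bounds or is not finite.
    The indices of the violated markers are returned, so the decision can
    be traced back to a specific layer and quantile."""
    lo, hi = bounds
    bad = (features < lo) | (features > hi) | ~np.isfinite(features)
    violated = np.flatnonzero(bad)
    return violated.size > 0, violated


# Synthetic stand-in for activations of two monitored hidden layers.
rng = np.random.default_rng(0)
calibration = [
    quantile_markers([rng.normal(size=(8, 16)), rng.normal(size=(4, 32))])
    for _ in range(50)
]
bounds = fit_bounds(calibration)

clean_run = [rng.normal(size=(8, 16)), rng.normal(size=(4, 32))]
faulty_run = [a.copy() for a in clean_run]
faulty_run[0].flat[0] = 1e30  # emulate a bit flip that produces a huge activation

flagged, violated = is_anomalous(quantile_markers(faulty_run), bounds)
```

In this sketch a single corrupted value produces a peak shift that shows up in the maximum marker of the first layer, whereas a bulk shift would move the inner quantiles. Because the detector is a plain comparison against per-marker bounds, the reason for each alarm can be read off directly from `violated`.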