We study calibration at various precisions for three architectures (ShuffleNetv2, GhostNet-VGG, and MobileOne) on two datasets (CIFAR-100 and PathMNIST). Calibration quality is observed to track quantization quality: it is well documented that performance worsens at lower precision, and we observe that calibration degrades along with it. The degradation is especially severe in the 4-bit activation regime. GhostNet-VGG is the most robust of the three to the overall performance drop at lower precision. We find that temperature scaling can improve calibration error for quantized networks, with some caveats. We hope these preliminary insights lead to more opportunities for explainable and reliable EdgeML.
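To make the two quantities mentioned above concrete, the following is a minimal sketch of an expected calibration error (ECE) estimate and of temperature scaling. It is not the authors' implementation: the 15 equal-width confidence bins, the grid search over the temperature (instead of a gradient-based optimizer), the function names, and the synthetic logits that stand in for the outputs of a quantized network are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; T > 1 softens the distribution, T < 1 sharpens it.
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expected_calibration_error(probs, labels, n_bins=15):
    # Weighted average gap between confidence and accuracy over equal-width bins.
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)

def nll(logits, labels, temperature):
    # Mean negative log-likelihood of the true labels at a given temperature.
    p = softmax(logits, temperature)
    return float(-np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12)))

def fit_temperature(logits, labels, grid=None):
    # Assumption: a simple grid search stands in for the usual NLL optimizer.
    if grid is None:
        grid = np.linspace(0.5, 5.0, 91)
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

def demo(seed=0, n=4000, k=10):
    # Hypothetical data: synthetic, deliberately overconfident logits,
    # not the outputs of any model from the paper.
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=n)
    z = rng.normal(size=(n, k))
    z[np.arange(n), labels] += 2.0
    logits = 6.0 * z
    half = n // 2
    # Fit T on one half (a stand-in validation split), evaluate on the other.
    t = fit_temperature(logits[:half], labels[:half])
    before = expected_calibration_error(softmax(logits[half:]), labels[half:])
    after = expected_calibration_error(softmax(logits[half:], t), labels[half:])
    return t, before, after
```

Calling `demo()` returns the fitted temperature together with the held-out ECE before and after scaling. Because dividing logits by a single scalar does not change the arg-max prediction, temperature scaling leaves accuracy untouched and only adjusts confidence.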