Binary Neural Networks (BNNs) have emerged as a promising way to reduce the memory footprint and compute cost of deep neural networks, but they suffer from accuracy degradation because constraining activations and weights to binary values limits their representational freedom. To compensate for the accuracy drop, we propose a novel BNN design called Binary Neural Network with INSTAnce-aware threshold (INSTA-BNN), which controls the quantization threshold dynamically in an input-dependent, or instance-aware, manner. We observe that higher-order statistics can serve as a representative metric for estimating the characteristics of the input distribution. INSTA-BNN adjusts the threshold dynamically using several sources of information, including higher-order statistics, and is also optimized carefully to incur minimal overhead on a real device. Our extensive study shows that INSTA-BNN outperforms the baseline by 3.0% and 2.8% on the ImageNet classification task at comparable computing cost, achieving 68.5% and 72.2% top-1 accuracy on ResNet-18 and MobileNetV1 based models, respectively.
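To make the idea of an instance-aware threshold concrete, the following is a minimal NumPy sketch and not the INSTA-BNN formulation itself. It assumes the threshold for each (instance, channel) pair is a learned base threshold shifted by a linear combination of that instance's mean, standard deviation, and skewness. The function name `instance_aware_binarize`, the coefficients `alpha`, `beta`, `gamma`, and the linear form are all hypothetical choices made for illustration; the abstract states only that higher-order statistics inform the threshold.

```python
import numpy as np


def instance_aware_binarize(x, alpha, beta, gamma, base_threshold, eps=1e-5):
    """Binarize pre-activations x of shape (N, C, H, W) to {-1, +1}.

    Hypothetical sketch: the threshold is computed per instance and per
    channel from spatial statistics, so two inputs with different
    distributions are binarized against different thresholds.
    alpha, beta, gamma, base_threshold broadcast against shape (1, C, 1, 1).
    """
    # First-order statistic: spatial mean per (instance, channel).
    mean = x.mean(axis=(2, 3), keepdims=True)
    centered = x - mean

    # Second-order statistic: standard deviation (eps avoids division by zero).
    var = (centered ** 2).mean(axis=(2, 3), keepdims=True)
    std = np.sqrt(var + eps)

    # Third-order statistic: skewness of the standardized values.
    skew = ((centered / std) ** 3).mean(axis=(2, 3), keepdims=True)

    # Assumed linear combination; the real design may differ.
    threshold = base_threshold + alpha * mean + beta * std + gamma * skew

    # Sign-style binarization against the instance-dependent threshold.
    binary = np.where(x >= threshold, 1.0, -1.0)
    return binary, threshold
```

With all coefficients set to zero this reduces to a fixed-threshold sign function, which is the static baseline the instance-aware variant is meant to improve on. In a trainable model the coefficients would be learned parameters and the non-differentiable comparison would need a gradient approximation such as a straight-through estimator, which this sketch omits.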