Network binarization has emerged as one of the most promising compression approaches, offering extraordinary computation and memory savings by minimizing the bit-width. However, recent research has shown that applying existing binarization algorithms to diverse tasks, architectures, and hardware in realistic scenarios is still not straightforward. Common challenges of binarization, such as accuracy degradation and efficiency limitations, suggest that its attributes are not fully understood. To close this gap, we present BiBench, a rigorously designed benchmark with in-depth analysis for network binarization. We first scrutinize the requirements of binarization in actual production and define evaluation tracks and metrics for a comprehensive and fair investigation. We then evaluate and analyze a series of milestone binarization algorithms that operate at the operator level and have had extensive influence. Our benchmark reveals that 1) the binarized operator has a crucial impact on the performance and deployability of binarized networks; 2) the accuracy of binarization varies significantly across different learning tasks and neural architectures; and 3) binarization demonstrates promising efficiency potential on edge devices despite limited hardware support. The results and analysis also point to a promising paradigm for accurate and efficient binarization. We believe that BiBench will contribute to the broader adoption of binarization and serve as a foundation for future research.