Deep learning is computationally intensive, and significant effort has focused on reducing its arithmetic complexity, particularly the energy consumption dominated by data movement. While the existing literature emphasizes inference, training is considerably more resource-intensive. This paper proposes a novel mathematical principle, the notion of Boolean variation, under which neurons built from Boolean weights and inputs can be trained, for the first time, efficiently in the Boolean domain using Boolean logic instead of gradient descent and real arithmetic. We explore its convergence, conduct extensive experimental benchmarking, and provide a consistent complexity evaluation that accounts for chip architecture, memory hierarchy, dataflow, and arithmetic precision. Our approach matches full-precision baseline accuracy on ImageNet classification, surpasses state-of-the-art results in semantic segmentation, and performs notably well in image super-resolution and in transformer-based natural language understanding. Moreover, it significantly reduces energy consumption during both training and inference.
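To give a concrete intuition for the idea of a derivative-like quantity in the Boolean domain, the sketch below implements one simple, hypothetical notion of variation: whether flipping a Boolean input flips a function's output. This is only an illustration of the general concept; the paper's actual definition of Boolean variation, and the training rule built on it, are not specified here, and the names `bool_variation`, `and_with_true`, and `and_with_false` are invented for this example.

```python
def bool_variation(f, x: bool) -> bool:
    """Return True if flipping the Boolean input x flips the output of f.

    Illustrative analogue of a derivative in the Boolean domain; a hedged
    sketch, not the paper's definition.
    """
    return f(x) != f(not x)

# Example: one-input slices of an AND "neuron" with the other input fixed.
and_with_true = lambda a: a and True    # output tracks a: sensitive to a
and_with_false = lambda a: a and False  # output is constantly False

print(bool_variation(and_with_true, True))   # output flips with a
print(bool_variation(and_with_false, True))  # output is insensitive to a
```

A sensitivity signal of this kind is what would let a Boolean training rule decide whether flipping a weight moves the output toward the target, playing the role that a gradient sign plays in real-valued backpropagation.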