We present QP-SBGD, a novel layer-wise stochastic optimiser tailored to training neural networks with binary weights, known as binary neural networks (BNNs), on quantum hardware. BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy. However, training them in practice remains an open challenge. Most known BNN optimisers either rely on projected updates or binarise weights post-training. Instead, QP-SBGD approximately maps the gradient onto binary variables by solving a quadratic constrained binary optimisation problem. Under practically reasonable assumptions, we show that this update rule converges at a rate of $\mathcal{O}(1 / \sqrt{T})$. Moreover, we show how the $\mathcal{NP}$-hard projection can be effectively executed on an adiabatic quantum annealer, harnessing recent advancements in quantum computation. We also introduce a projected version of this update rule and prove that, if a fixed point exists in the binary variable space, the modified updates converge to it. Finally, our algorithm is implemented layer-wise, making it suitable for training larger networks on resource-limited quantum hardware. Through extensive evaluations, we show that QP-SBGD outperforms or is on par with competitive and well-established baselines such as BinaryConnect, signSGD and ProxQuant when optimising the Rosenbrock function and when training BNNs as well as binary graph neural networks.
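To make the idea of mapping an update onto binary variables concrete, the following is a minimal sketch and not the formulation used by QP-SBGD. It assumes a generic objective $b^\top Q b + c^\top b$ over $b \in \{-1, +1\}^n$ and solves it by exhaustive search, which stands in for the quantum annealer and is feasible only for very small $n$. The function names `solve_binary_quadratic` and `binary_step`, and the choice $Q = I$, $c = -2(w - \eta g)$, are illustrative assumptions.

```python
import itertools
import numpy as np


def solve_binary_quadratic(Q, c):
    """Exhaustively minimise b^T Q b + c^T b over b in {-1, +1}^n.

    Classical brute-force stand-in for an annealer; cost grows as 2^n.
    """
    n = len(c)
    best_b, best_val = None, float("inf")
    for bits in itertools.product((-1.0, 1.0), repeat=n):
        b = np.array(bits)
        val = float(b @ Q @ b + c @ b)
        if val < best_val:
            best_b, best_val = b, val
    return best_b, best_val


def binary_step(w, grad, lr):
    """Hypothetical update: pick the binary vector closest to w - lr * grad.

    ||b - t||^2 = b^T I b - 2 t^T b + const, so Q = I and c = -2 t.
    With this uncoupled Q the minimiser is simply sign(t); a coupled Q
    would make the problem genuinely combinatorial.
    """
    target = w - lr * grad
    Q = np.eye(len(w))
    c = -2.0 * target
    b, _ = solve_binary_quadratic(Q, c)
    return b


if __name__ == "__main__":
    w = np.array([1.0, -1.0, 1.0])
    grad = np.array([3.0, 0.5, -2.0])
    print(binary_step(w, grad, lr=1.0))
```

With the identity matrix as `Q` the search reduces to taking the sign of the gradient step, which is why this toy case is easy. The hardness mentioned in the abstract arises when the quadratic term couples the binary variables.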