Deploying 3D graph neural networks (GNNs) that are equivariant to 3D rotations (the group SO(3)) on edge devices is challenging due to their high computational cost. This paper addresses the problem by compressing and accelerating an SO(3)-equivariant GNN with low-bit quantization techniques. Specifically, we introduce three innovations for quantized equivariant transformers: (1) a magnitude-direction decoupled quantization scheme that separately quantizes the norm and orientation of equivariant (vector) features, (2) a branch-separated quantization-aware training strategy that treats invariant and equivariant feature channels differently in an attention-based SO(3)-GNN, and (3) a robustness-enhancing attention normalization mechanism that stabilizes low-precision attention computations. Experiments on the QM9 and rMD17 molecular benchmarks demonstrate that our 8-bit models achieve energy- and force-prediction accuracy comparable to that of full-precision baselines, with markedly improved efficiency. We also conduct ablation studies to quantify the contribution of each component to maintaining accuracy and equivariance under quantization, using the local error of equivariance (LEE) metric. The proposed techniques enable the deployment of symmetry-aware GNNs in practical chemistry applications with 2.37--2.73x faster inference and 4x smaller model size, without sacrificing accuracy or physical symmetry.
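To make the first idea concrete, the sketch below illustrates one plausible form of magnitude-direction decoupled quantization: the rotation-invariant norm of each vector channel and its unit direction are fake-quantized separately and then recombined, so quantizing the magnitude only rescales the direction. This is a minimal illustration, not the paper's implementation; the bit width, symmetric per-tensor scaling, and tensor layout are assumptions.

```python
# Minimal sketch (assumed details, not the paper's code) of magnitude-direction
# decoupled quantization for equivariant vector features.
import torch


def quantize_symmetric(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Fake-quantize a tensor with a symmetric per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return torch.round(x / scale).clamp(-qmax, qmax) * scale


def quantize_vector_features(v: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Quantize vector features of shape (..., channels, 3).

    The magnitude (an invariant scalar) and the unit direction are
    quantized independently and then recombined.
    """
    norm = v.norm(dim=-1, keepdim=True)      # rotation-invariant magnitude
    direction = v / norm.clamp(min=1e-8)     # unit direction (equivariant part)
    q_norm = quantize_symmetric(norm, num_bits)
    q_dir = quantize_symmetric(direction, num_bits)
    return q_norm * q_dir


# Example: 4 nodes, 16 vector channels, 3 spatial components each.
v = torch.randn(4, 16, 3)
v_q = quantize_vector_features(v)
print((v - v_q).abs().max())  # quantization error is small at 8 bits
```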