Graph Neural Networks (GNNs) are neural networks designed to process graph-structured data, capturing the relationships and interactions between nodes through the message-passing mechanism. GNN quantization has emerged as a promising approach for reducing model size and accelerating inference in resource-constrained environments. Compared with quantization in LLMs, GNN quantization places greater emphasis on quantizing graph features. Motivated by this, we propose to leverage prompt learning, which manipulates the input data, to improve the performance of quantization-aware training (QAT) for GNNs. Since prompting the node features alone can only make part of the quantized aggregation results optimal, we introduce Low-Rank Aggregation Prompting (LoRAP), which injects lightweight, input-dependent prompts into each aggregated feature to optimize the results of quantized aggregation. Extensive evaluations of 4 leading QAT frameworks on 9 graph datasets demonstrate that LoRAP consistently enhances the performance of low-bit quantized GNNs while introducing minimal computational overhead.
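To make the core idea concrete, the following is a minimal sketch of injecting a low-rank, input-dependent prompt into a quantized aggregation. It is not the paper's implementation: the mean aggregation, the uniform symmetric fake quantizer, and the names `fake_quantize`, `lorap_aggregate`, `U`, and `V` are all assumptions made for illustration.

```python
import numpy as np

def fake_quantize(x, n_bits=4):
    # Hypothetical uniform symmetric fake quantizer (per-tensor scale),
    # standing in for whatever quantizer the host QAT framework uses.
    scale = np.abs(x).max() / (2 ** (n_bits - 1) - 1) + 1e-8
    return np.round(x / scale) * scale

def lorap_aggregate(H, A, U, V, n_bits=4):
    """Quantized aggregation with a low-rank, input-dependent prompt.

    H: (N, d) node features
    A: (N, N) row-normalized adjacency (mean aggregation here)
    U: (d, r), V: (r, d) learnable low-rank prompt factors, r << d
    """
    agg = A @ fake_quantize(H, n_bits)  # quantized aggregation result
    prompt = (agg @ U) @ V              # input-dependent, low-rank prompt
    return agg + prompt                 # prompted aggregation

# Toy usage on random data
rng = np.random.default_rng(0)
N, d, r = 5, 8, 2
H = rng.standard_normal((N, d))
A = np.ones((N, N)) / N                 # mean over all nodes, for simplicity
U = 0.01 * rng.standard_normal((d, r))  # small init keeps the prompt lightweight
V = 0.01 * rng.standard_normal((r, d))
out = lorap_aggregate(H, A, U, V)
```

Because the prompt is computed from the aggregated features themselves, it is input-dependent, and the rank-`r` factorization keeps its overhead at `2*d*r` extra parameters per layer rather than `d*d`.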