Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training - 专知论文

会员服务 ·

0

图 · GNN · 边缘化 · Analysis · 模型评估 ·

2023 年 6 月 2 日

Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training

翻译：分布式全图GNN训练的自适应消息量化与并行化

Borui Wan,Juntao Zhao,Chuan Wu

Distributed full-graph training of Graph Neural Networks (GNNs) over large graphs is bandwidth-demanding and time-consuming. Frequent exchanges of node features, embeddings and embedding gradients (all referred to as messages) across devices bring significant communication overhead for nodes with remote neighbors on other devices (marginal nodes) and unnecessary waiting time for nodes without remote neighbors (central nodes) in the training graph. This paper proposes an efficient GNN training system, AdaQP, to expedite distributed full-graph GNN training. We stochastically quantize messages transferred across devices to lower-precision integers for communication traffic reduction and advocate communication-computation parallelization between marginal nodes and central nodes. We provide theoretical analysis to prove fast training convergence (at the rate of O(T^{-1}) with T being the total number of training epochs) and design an adaptive quantization bit-width assignment scheme for each message based on the analysis, targeting a good trade-off between training convergence and efficiency. Extensive experiments on mainstream graph datasets show that AdaQP substantially improves distributed full-graph training's throughput (up to 3.01 X) with negligible accuracy drop (at most 0.30%) or even accuracy improvement (up to 0.19%) in most cases, showing significant advantages over the state-of-the-art works.

翻译：图神经网络（GNN）在大规模图上的分布式全图训练具有高带宽消耗和长时间特性。训练图中，设备间频繁交换节点特征、嵌入及嵌入梯度（统称为消息）会导致：对于在其他设备上有远程邻居的节点（边缘节点）产生显著通信开销，而对于无远程邻居的节点（中心节点）则带来不必要的等待时间。本文提出高效GNN训练系统AdaQP，通过随机量化设备间传输的消息为低精度整数以减少通信流量，并倡导边缘节点与中心节点间的通信-计算并行化。我们提供理论分析证明快速训练收敛性（收敛速率为O(T^{-1})，其中T为总训练轮数），并基于该分析设计自适应量化位宽分配方案以优化训练收敛性与效率的权衡。在主流图数据集上的大量实验表明，AdaQP在多数情况下将分布式全图训练的吞吐量提升高达3.01倍，同时精度损失不超过0.30%甚至提升0.19%，显著优于现有最先进工作。

0

相关内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

246+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

GNN 新基准！Long Range Graph Benchmark

GNN 新基准！Long Range Graph Benchmark

图与推荐

0+阅读 · 2022年10月18日

ICLR'23截稿, 图神经网络依然火热 (附42 篇好文整理)

ICLR'23截稿, 图神经网络依然火热 (附42 篇好文整理)

图与推荐

2+阅读 · 2022年10月5日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

罗巴代数的表示和罗巴代数在operad中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

Peroxidasin信号通路在先天性白内障发病中的分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

奶牛乳腺脂类合成代谢转录调控机制与基因网络构建

国家自然科学基金

0+阅读 · 2014年12月31日

压缩感知与稀疏信号恢复

国家自然科学基金

2+阅读 · 2014年12月31日

Notch信号通路参与家蚕胚胎发育分子机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

益气活血法对大鼠萎缩性胃炎Hedgehog信号通路的调控机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

HO-1基因过表达阻断MLK3/MKK7/JNK信号通路对脊髓损伤保护作用的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

水稻细菌性基腐病菌zeamine毒素基因簇的克隆与功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

Hedgehog信号通路调控宫颈癌上皮间质转化的作用及机制

国家自然科学基金

0+阅读 · 2011年12月31日

淫羊藿总黄酮调控骨性关节炎p38MAPK信号转导通路的研究

国家自然科学基金

0+阅读 · 2010年12月31日

On Privileged and Convergent Bases in Neural Network Representations

Arxiv

0+阅读 · 2023年7月24日

Robust Fully-Asynchronous Methods for Distributed Training over General Architecture

Arxiv

0+阅读 · 2023年7月21日

Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis

Arxiv

0+阅读 · 2023年7月21日

Distributed Graph Neural Network Training: A Survey

Arxiv

16+阅读 · 2022年11月1日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Cross-Node Federated Graph Neural Network for Spatio-Temporal Data Modeling

Arxiv

17+阅读 · 2021年6月9日

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

Arxiv

14+阅读 · 2021年4月27日

Graph Neural Networks with Heterophily

Arxiv

19+阅读 · 2021年2月4日

Distributed Graph Convolutional Networks

Arxiv

19+阅读 · 2020年7月13日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

VIP会员

文章信息

相关主题

最新内容

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

专知会员服务

2+阅读 · 53分钟前

重新思考无人机时代的生存能力

重新思考无人机时代的生存能力

专知会员服务

0+阅读 · 今天7:44

装甲突击旅：现代战争思考、战斗与组织

装甲突击旅：现代战争思考、战斗与组织

专知会员服务

0+阅读 · 今天7:28

在人工智能加速决策环境中拓展OODA循环

在人工智能加速决策环境中拓展OODA循环

专知会员服务

1+阅读 · 今天7:18

《廉价自杀式无人机战争的军事战略影响：乌克兰与伊朗案例研究》

《廉价自杀式无人机战争的军事战略影响：乌克兰与伊朗案例研究》

专知会员服务

1+阅读 · 今天7:07

军事欺骗：供作战战术指挥官使用的工具

军事欺骗：供作战战术指挥官使用的工具

专知会员服务

1+阅读 · 今天7:03

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

专知会员服务

3+阅读 · 6月23日

综述 | 世界动作模型：少做梦，多行动

综述 | 世界动作模型：少做梦，多行动

专知会员服务

4+阅读 · 6月23日

美以伊冲突：无人机与人工智能的运用

美以伊冲突：无人机与人工智能的运用

专知会员服务

7+阅读 · 6月23日

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

专知会员服务

3+阅读 · 6月23日

《特种部队在透明战场中的生存力》最新报告

《特种部队在透明战场中的生存力》最新报告

专知会员服务

4+阅读 · 6月23日

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

专知会员服务

7+阅读 · 6月23日

《人工智能生成的零日漏洞：对未来作战的影响》

《人工智能生成的零日漏洞：对未来作战的影响》

专知会员服务

5+阅读 · 6月23日

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

专知会员服务

3+阅读 · 6月23日

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

6+阅读 · 6月22日

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

246+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

重新思考无人机时代的生存能力

在人工智能加速决策环境中拓展OODA循环

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

装甲突击旅：现代战争思考、战斗与组织

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

GNN 新基准！Long Range Graph Benchmark

GNN 新基准！Long Range Graph Benchmark

图与推荐

0+阅读 · 2022年10月18日

ICLR'23截稿, 图神经网络依然火热 (附42 篇好文整理)

ICLR'23截稿, 图神经网络依然火热 (附42 篇好文整理)

图与推荐

2+阅读 · 2022年10月5日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

On Privileged and Convergent Bases in Neural Network Representations

Arxiv

0+阅读 · 2023年7月24日

Robust Fully-Asynchronous Methods for Distributed Training over General Architecture

Arxiv

0+阅读 · 2023年7月21日

Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis

Arxiv

0+阅读 · 2023年7月21日

Distributed Graph Neural Network Training: A Survey

Arxiv

16+阅读 · 2022年11月1日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Cross-Node Federated Graph Neural Network for Spatio-Temporal Data Modeling

Arxiv

17+阅读 · 2021年6月9日

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

Arxiv

14+阅读 · 2021年4月27日

Graph Neural Networks with Heterophily

Arxiv

19+阅读 · 2021年2月4日

Distributed Graph Convolutional Networks

Arxiv

19+阅读 · 2020年7月13日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

相关基金

罗巴代数的表示和罗巴代数在operad中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

Peroxidasin信号通路在先天性白内障发病中的分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

奶牛乳腺脂类合成代谢转录调控机制与基因网络构建

国家自然科学基金

0+阅读 · 2014年12月31日

压缩感知与稀疏信号恢复

国家自然科学基金

2+阅读 · 2014年12月31日

Notch信号通路参与家蚕胚胎发育分子机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

益气活血法对大鼠萎缩性胃炎Hedgehog信号通路的调控机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

HO-1基因过表达阻断MLK3/MKK7/JNK信号通路对脊髓损伤保护作用的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

水稻细菌性基腐病菌zeamine毒素基因簇的克隆与功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

Hedgehog信号通路调控宫颈癌上皮间质转化的作用及机制

国家自然科学基金

0+阅读 · 2011年12月31日

淫羊藿总黄酮调控骨性关节炎p38MAPK信号转导通路的研究

国家自然科学基金

0+阅读 · 2010年12月31日

微信扫码咨询专知VIP会员