Topology-aware Generalization of Decentralized SGD

February 4, 2023

Tongtian Zhu, Fengxiang He, Lan Zhang, Zhengyang Niu, Mingli Song, Dacheng Tao

From arXiv. Accepted for publication at the 39th International Conference on Machine Learning (ICML 2022).

This paper studies the algorithmic stability and generalizability of decentralized stochastic gradient descent (D-SGD). We prove that the consensus model learned by D-SGD is $\mathcal{O}(N^{-1}+m^{-1}+\lambda^2)$-stable in expectation in the non-convex non-smooth setting, where $N$ is the total sample size, $m$ is the number of workers, and $1-\lambda$ is the spectral gap that measures the connectivity of the communication topology. These results then deliver an $\mathcal{O}(N^{-(1+\alpha)/2}+m^{-(1+\alpha)/2}+\lambda^{1+\alpha}+\phi_{\mathcal{S}})$ in-average generalization bound, which is non-vacuous even when $\lambda$ is close to $1$, in contrast to the vacuous bounds suggested by existing literature on the projected version of D-SGD. Our theory indicates that the generalizability of D-SGD is positively correlated with the spectral gap, and can explain why consensus control in the initial training phase can ensure better generalization. Experiments with VGG-11 and ResNet-18 on CIFAR-10, CIFAR-100, and Tiny-ImageNet support our theory. To the best of our knowledge, this is the first work on the topology-aware generalization of vanilla D-SGD. Code is available at https://github.com/Raiden-Zhu/Generalization-of-DSGD.
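To make the role of the spectral gap concrete, the following is a minimal sketch of how $\lambda$ might be computed for two communication topologies. It is not taken from the authors' repository. It assumes a symmetric doubly stochastic gossip matrix $W$, takes $\lambda$ to be the largest eigenvalue magnitude of $W$ outside the consensus direction (a common convention, assumed here), and uses uniform $1/3$ weights on the ring. All function names are hypothetical.

```python
import math


def ring_gossip_matrix(m):
    """Ring topology (assumed weighting): each worker averages itself and
    its two neighbours with weight 1/3. Symmetric and doubly stochastic."""
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in (i - 1, i, i + 1):
            W[i][j % m] += 1.0 / 3.0
    return W


def complete_gossip_matrix(m):
    """Fully connected topology: every worker averages all m models."""
    return [[1.0 / m] * m for _ in range(m)]


def second_eigenvalue_magnitude(W, iters=500):
    """Estimate lambda for a symmetric doubly stochastic W by power
    iteration restricted to vectors orthogonal to the all-ones vector."""
    m = len(W)
    v = [math.sin(i + 1.0) for i in range(m)]  # deterministic start vector
    lam = 0.0
    for _ in range(iters):
        mean = sum(v) / m
        v = [x - mean for x in v]  # remove the consensus direction
        norm = math.sqrt(sum(x * x for x in v))
        if norm < 1e-12:  # nothing survives outside consensus
            return 0.0
        v = [x / norm for x in v]
        v = [sum(W[i][j] * v[j] for j in range(m)) for i in range(m)]
        lam = math.sqrt(sum(x * x for x in v))  # ||W v|| for unit v
    return lam


def spectral_gap(W):
    """Spectral gap 1 - lambda; larger means a better-connected topology."""
    return 1.0 - second_eigenvalue_magnitude(W)


if __name__ == "__main__":
    m = 8  # hypothetical number of workers
    for name, W in [("ring", ring_gossip_matrix(m)),
                    ("complete", complete_gossip_matrix(m))]:
        lam = second_eigenvalue_magnitude(W)
        # lam ** 2 is the topology-dependent term of the stability bound.
        print(f"{name}: lambda={lam:.4f}, gap={1.0 - lam:.4f}, lambda^2={lam ** 2:.4f}")
```

Under these assumptions the fully connected topology gives $\lambda = 0$, while the ring gives a $\lambda$ that moves toward $1$ as $m$ grows, so the $\lambda^2$ term in the stability bound is larger for sparser topologies.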

