Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

Oscillation · Joint optimization · Performance degradation · Data dependence · Generalized

April 4, 2023

Yuexiao Ma, Huixia Li, Xiawu Zheng, Xuefeng Xiao, Rui Wang, Shilei Wen, Xin Pan, Fei Chao, Rongrong Ji

From arXiv; accepted by CVPR 2023

Post-training quantization (PTQ) is widely regarded as one of the most efficient compression methods in practice, benefiting from its data privacy and low computation cost. We argue that PTQ methods suffer from an overlooked problem: oscillation. In this paper, we take the initiative to explore this problem and present a theoretical proof that explains why it is essential in PTQ. We then address it by introducing a principled and generalized theoretical framework. In particular, we first formulate oscillation in PTQ and prove that the problem is caused by differences in module capacity. To this end, we define module capacity (ModCap) under data-dependent and data-free scenarios, where the differentials between adjacent modules measure the degree of oscillation. The problem is then solved by selecting the top-k differentials, whose corresponding modules are jointly optimized and quantized. Extensive experiments demonstrate that our method successfully reduces the performance drop and generalizes to different neural networks and PTQ methods. For example, with 2/4-bit ResNet-50 quantization, our method surpasses the previous state-of-the-art method by 1.9%. The gain is more significant for small-model quantization; e.g., it surpasses the BRECQ method by 6.61% on MobileNetV2*0.5.
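The selection step described in the abstract (rank the differentials between adjacent modules, take the top-k, and optimize the corresponding modules jointly) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the ModCap formula, so the per-module capacity values are taken as a given input here, the differential is assumed to be the absolute difference between neighbors, and the function name `group_modules_by_capacity_gap` is hypothetical.

```python
from typing import List


def group_modules_by_capacity_gap(capacities: List[float], k: int) -> List[List[int]]:
    """Merge adjacent modules across the k largest capacity gaps.

    `capacities[i]` stands in for the ModCap of module i (assumed to be
    precomputed; the actual definition is in the paper, not reproduced here).
    Returns groups of module indices; each group would be optimized jointly.
    """
    n = len(capacities)
    if n == 0:
        return []

    # Differential between module i and module i + 1
    # (assumption: absolute difference of capacities).
    gaps = [abs(capacities[i + 1] - capacities[i]) for i in range(n - 1)]

    # Indices of the k largest gaps; ties are broken by position.
    k = max(0, min(k, len(gaps)))
    top = set(sorted(range(len(gaps)), key=lambda i: (-gaps[i], i))[:k])

    # Walk the modules in order, merging across each selected boundary.
    groups: List[List[int]] = [[0]]
    for i in range(n - 1):
        if i in top:
            groups[-1].append(i + 1)  # modules i and i + 1 share a group
        else:
            groups.append([i + 1])    # start a new group at module i + 1
    return groups


if __name__ == "__main__":
    # Made-up capacity values, for illustration only.
    print(group_modules_by_capacity_gap([1.0, 1.1, 3.0, 3.2, 0.5], k=2))
```

With the made-up capacities above, the two largest gaps fall between modules 1 and 2 and between modules 3 and 4, so those pairs end up in the same group while module 0 stays on its own. Two selected boundaries that are adjacent would chain into a single larger group.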
