Initialization · Regularization · Algorithm · Jacobian · Transport

April 5, 2023

Rethinking Initialization of the Sinkhorn Algorithm


James Thornton, Marco Cuturi

While the optimal transport (OT) problem was originally formulated as a linear program, the addition of entropic regularization has proven beneficial, both computationally and statistically, for many applications. The Sinkhorn fixed-point algorithm is the most popular approach for solving this regularized problem, and, as a result, multiple attempts have been made to reduce its runtime using, e.g., annealing in the regularization parameter, momentum or acceleration. The premise of this work is that initialization of the Sinkhorn algorithm has received comparatively little attention, possibly due to two preconceptions. First, because the regularized OT problem is convex, it may not seem worth crafting a good initialization, since any initialization is guaranteed to work. Second, because the outputs of the Sinkhorn algorithm are often unrolled in end-to-end pipelines, a data-dependent initialization would bias Jacobian computations. We challenge this conventional wisdom, and show that data-dependent initializers result in dramatic speed-ups, with no effect on differentiability as long as implicit differentiation is used. Our initializations rely on closed forms for exact or approximate OT solutions that are known in the 1D, Gaussian or GMM settings. They can be used with minimal tuning, and result in consistent speed-ups for a wide variety of OT problems.
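To make the idea of a data-dependent initializer concrete, the following is a minimal NumPy sketch, not the authors' implementation. It runs log-domain Sinkhorn on two 1D point clouds with the squared-distance cost, once from the default zero potential and once from the closed-form Kantorovich potential between Gaussians fitted to the two clouds. The function names (`sinkhorn`, `gaussian_init_1d`, `plan`), the toy data, the value of `eps`, and the stopping rule are all assumptions made for illustration.

```python
import numpy as np


def logsumexp(z, axis):
    """Numerically stable log(sum(exp(z))) along one axis."""
    zmax = np.max(z, axis=axis, keepdims=True)
    return np.squeeze(zmax, axis=axis) + np.log(np.sum(np.exp(z - zmax), axis=axis))


def plan(f, g, C, eps):
    """Coupling induced by dual potentials: P_ij = exp((f_i + g_j - C_ij) / eps)."""
    return np.exp((f[:, None] + g[None, :] - C) / eps)


def sinkhorn(C, a, b, eps, f_init=None, tol=1e-6, max_iters=20_000):
    """Log-domain Sinkhorn; returns (f, g, number of iterations used)."""
    f = np.zeros_like(a) if f_init is None else np.asarray(f_init, dtype=float).copy()
    log_a, log_b = np.log(a), np.log(b)
    for it in range(1, max_iters + 1):
        # g-update: makes the column marginals of the plan equal to b.
        g = eps * (log_b - logsumexp((f[:, None] - C) / eps, axis=0))
        # Stop when the row marginals are also within tol (L1 error).
        err = np.abs(plan(f, g, C, eps).sum(axis=1) - a).sum()
        if err < tol:
            return f, g, it
        # f-update: makes the row marginals equal to a.
        f = eps * (log_a - logsumexp((g[None, :] - C) / eps, axis=1))
    return f, g, max_iters


def gaussian_init_1d(x, a, y, b):
    """Hypothetical helper: unregularized potential for cost (x - y)^2 between
    1D Gaussians fitted to each cloud, defined up to an additive constant.
    The Gaussian OT map is T(x) = m_b + A (x - m_a) with A = s_b / s_a, and
    f'(x) = 2 (x - T(x)) integrates to the expression returned below."""
    m_a, m_b = np.sum(a * x), np.sum(b * y)
    s_a = np.sqrt(np.sum(a * (x - m_a) ** 2))
    s_b = np.sqrt(np.sum(b * (y - m_b) ** 2))
    A = s_b / s_a
    return x**2 - A * (x - m_a) ** 2 - 2.0 * m_b * x


# Toy problem (assumed data): two 1D Gaussian samples with uniform weights.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=30)
y = rng.normal(1.0, 1.5, size=40)
a = np.full(x.size, 1.0 / x.size)
b = np.full(y.size, 1.0 / y.size)
C = (x[:, None] - y[None, :]) ** 2
eps = 2.0

f0, g0, iters_zero = sinkhorn(C, a, b, eps)
f1, g1, iters_gauss = sinkhorn(C, a, b, eps, f_init=gaussian_init_1d(x, a, y, b))
P_zero, P_gauss = plan(f0, g0, C, eps), plan(f1, g1, C, eps)
print("iterations from zero init:    ", iters_zero)
print("iterations from Gaussian init:", iters_gauss)
```

Because the regularized problem is convex, both runs should reach the same coupling; only the iteration count depends on the starting point. How much the Gaussian start helps in this toy setting depends on `eps` and on how close the data are to Gaussian, so the sketch prints the counts rather than asserting a speed-up.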

