Stochastic regularized majorization-minimization with weakly convex and multi-convex surrogates

Topics: regularization term · scenario · optimizer · nonconvex · functional

March 21, 2023

From arXiv: 64 pages, 5 figures, 1 table

Stochastic majorization-minimization (SMM) is a class of stochastic optimization algorithms that proceed by sampling new data points and minimizing a recursive average of surrogate functions of an objective function. Previously, the surrogates were required to be strongly convex, and no convergence rate analysis was available for the general nonconvex setting. In this paper, we propose an extension of SMM in which the surrogates are allowed to be only weakly convex or block multi-convex, and the averaged surrogates are approximately minimized with proximal regularization or block-minimized within diminishing radii, respectively. For the general nonconvex constrained setting with non-i.i.d. data samples, we show that the first-order optimality gap of the proposed algorithm decays at the rate $O((\log n)^{1+\epsilon}/n^{1/2})$ for the empirical loss and $O((\log n)^{1+\epsilon}/n^{1/4})$ for the expected loss, where $n$ denotes the number of data samples processed. Under an additional assumption, the latter convergence rate can be improved to $O((\log n)^{1+\epsilon}/n^{1/2})$. As a corollary, we obtain the first convergence rate bounds for various optimization methods in the general nonconvex, dependent-data setting: double-averaging projected gradient descent and its generalizations, proximal point empirical risk minimization, and online matrix/tensor decomposition algorithms. We also provide experimental validation of our results.
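
The recursion described in the abstract (sample a data point, build a surrogate, update a recursive average of surrogates, then minimize that average plus a proximal term) can be illustrated with a small sketch. This is not the paper's algorithm or its setting. It assumes a least-squares loss, unit-norm i.i.d. inputs, an unconstrained parameter, weights $w_n = 1/n$, and a quadratic surrogate that happens to be strongly convex. The function name `smm_prox` and the parameter `rho` are hypothetical, chosen only for illustration.

```python
import numpy as np


def smm_prox(samples, theta0, rho=0.1):
    """Illustrative stochastic MM with a proximal term (not the paper's method).

    Per-sample loss: 0.5 * (x @ theta - y) ** 2, with every x assumed unit-norm.
    Under that assumption the quadratic
        g_n(theta) = loss(theta_prev) + grad @ (theta - theta_prev)
                     + 0.5 * ||theta - theta_prev||^2
    majorizes the per-sample loss and is tight at theta_prev.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    b_bar = np.zeros_like(theta)  # linear coefficient of the averaged surrogate
    for n, (x, y) in enumerate(samples, start=1):
        w = 1.0 / n  # assumed weight; w = 1 at n = 1 initializes the average
        grad = (x @ theta - y) * x
        # g_n(theta) = 0.5 * ||theta||^2 - b_n @ theta + const
        b_n = theta - grad
        # Recursive average of surrogates: only the linear term changes.
        b_bar = (1.0 - w) * b_bar + w * b_n
        # Closed-form minimizer of
        #   0.5 * ||theta||^2 - b_bar @ theta + 0.5 * rho * ||theta - theta_prev||^2
        theta = (b_bar + rho * theta) / (1.0 + rho)
    return theta


# Noise-free synthetic data with unit-norm inputs (an assumption of this sketch).
rng = np.random.default_rng(0)
theta_star = np.array([1.0, -2.0])
xs = rng.normal(size=(2000, 2))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
ys = xs @ theta_star

theta_hat = smm_prox(zip(xs, ys), theta0=np.zeros(2))
print("distance to theta_star:", np.linalg.norm(theta_hat - theta_star))
```

Because every surrogate here shares the same quadratic term, averaging surrogates reduces to averaging their linear coefficients, which keeps the update in closed form. The weakly convex and block multi-convex cases treated in the paper would need a different surrogate and an inner solver, which this sketch does not attempt.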
