The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints that arise in machine learning applications. In recent years, stochastic versions of FW have gained popularity, motivated by large datasets for which computing the full gradient is prohibitively expensive. In this paper, we present two new variants of the FW algorithm for stochastic finite-sum minimization. Our algorithms have the best convergence guarantees among existing stochastic FW approaches for both convex and non-convex objective functions. Our methods avoid the need to continually collect large batches, an issue common to many stochastic projection-free approaches. Moreover, our second approach requires neither large batches nor full deterministic gradients; reliance on these is a typical weakness of many techniques for finite-sum problems. The faster theoretical rates of our approaches are confirmed experimentally.
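For orientation, the setting above can be illustrated with a minimal sketch of a generic mini-batch stochastic FW iteration. This is a textbook-style baseline, not either of the two variants proposed in the paper. The example problem (least squares over an l1 ball), the fixed batch size, the step size 2/(k+2), and the names `lmo_l1_ball` and `stochastic_frank_wolfe` are all assumptions chosen for illustration.

```python
import numpy as np


def lmo_l1_ball(grad, radius):
    """Linear minimization oracle: argmin of <grad, s> over ||s||_1 <= radius.

    The minimizer is a signed vertex of the l1 ball, so no projection is needed.
    """
    s = np.zeros_like(grad)
    i = int(np.argmax(np.abs(grad)))
    s[i] = -radius * np.sign(grad[i])
    return s


def stochastic_frank_wolfe(A, b, radius=1.0, batch_size=8, n_iters=200, seed=0):
    """Generic mini-batch FW (illustrative baseline, not the paper's method) for
    min over ||x||_1 <= radius of (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)  # the origin is feasible for the l1 ball
    for k in range(n_iters):
        # Estimate the gradient from a small random batch instead of all n terms.
        idx = rng.choice(n, size=batch_size, replace=False)
        residual = A[idx] @ x - b[idx]
        grad = A[idx].T @ residual / batch_size
        # FW step: move toward the vertex returned by the oracle.
        s = lmo_l1_ball(grad, radius)
        gamma = 2.0 / (k + 2)  # assumed classical step-size schedule
        x = x + gamma * (s - x)  # convex combination keeps x feasible
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((50, 5))
    b = A @ np.array([0.5, 0.0, 0.0, -0.3, 0.0])
    x = stochastic_frank_wolfe(A, b)
    print("l1 norm of iterate:", np.abs(x).sum())
```

Each iterate is a convex combination of feasible points, so it stays inside the constraint set without any projection. With a fixed small batch, the gradient noise does not vanish in this naive scheme, which is one reason such baselines often resort to growing batches or periodic full gradients.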