The Frank-Wolfe algorithm is a popular method in structurally constrained machine learning applications because of its low per-iteration cost. A major limitation, however, is its slow rate of convergence, which is difficult to accelerate because the step directions zig-zag erratically, even asymptotically close to the solution. We view this as an artifact of discretization: the Frank-Wolfe \emph{flow}, which is the trajectory of the method at asymptotically small step sizes, does not zig-zag, so reducing discretization error goes hand in hand with a more stable method that has better convergence properties. We propose two improvements: a multistep Frank-Wolfe method that directly applies optimized higher-order discretization schemes, and a linear minimization oracle (LMO) averaging scheme with reduced discretization error, whose local convergence rate over general convex sets accelerates from $O(1/k)$ to up to $O(1/k^{3/2})$.
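For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard (vanilla) Frank-Wolfe iteration with the common step size $2/(k+2)$, not the multistep or LMO-averaging methods proposed here. The constraint set (the probability simplex), the quadratic objective, and the names `lmo_simplex` and `frank_wolfe` are all assumptions chosen only for this example.

```python
import numpy as np


def lmo_simplex(grad):
    """Linear minimization oracle over the probability simplex.

    Returns the vertex e_i that minimizes <grad, s> over the simplex.
    (Illustrative choice of constraint set, not taken from the text.)
    """
    s = np.zeros_like(grad)
    s[np.argmin(grad)] = 1.0
    return s


def frank_wolfe(grad_f, x0, num_iters=200):
    """Vanilla Frank-Wolfe with the standard step size 2 / (k + 2)."""
    x = x0.copy()
    gaps = []
    for k in range(num_iters):
        g = grad_f(x)
        s = lmo_simplex(g)               # one LMO call per iteration
        gaps.append(float(g @ (x - s)))  # Frank-Wolfe duality gap at x
        gamma = 2.0 / (k + 2.0)
        x = x + gamma * (s - x)          # convex combination stays feasible
    return x, gaps


# Hypothetical test problem: f(x) = 0.5 * ||x - b||^2 over the simplex,
# so the gradient is x - b.
b = np.array([0.6, 0.5, -0.2])
x0 = np.array([1.0, 0.0, 0.0])
x_final, gaps = frank_wolfe(lambda x: x - b, x0)
```

Each update moves toward a single vertex returned by the LMO, so consecutive directions can alternate between vertices; this is the zig-zagging behavior described above. Read as a forward-Euler step, the update `x + gamma * (s - x)` is one discretization of a continuous trajectory, which is the viewpoint that motivates looking at discretization error.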