Optimization algorithms such as projected Newton's method, FISTA, mirror descent and its variants enjoy near-optimal regret bounds and convergence rates (e.g., $O(T^{1/2})$ regret of online mirror descent), but suffer from the computational bottleneck of computing ``projections'' in potentially every iteration. On the other hand, conditional gradient variants solve only a linear optimization problem in each iteration, but result in suboptimal rates (e.g., $O(T^{3/4})$ regret of online Frank-Wolfe). Motivated by this trade-off between runtime and convergence rate, we consider iterative projections of close-by points over widely prevalent submodular base polytopes $B(f)$. We first give necessary and sufficient conditions for when two close points project to the same face of a polytope, and then show that points far away from the polytope project onto its vertices with high probability. We next use this theory to develop a toolkit that speeds up the computation of iterative projections over submodular polytopes using both discrete and continuous perspectives. We subsequently adapt the away-step Frank-Wolfe algorithm to use this information and enable early termination. For the special case of cardinality-based submodular polytopes, we improve the runtime of computing certain Bregman projections by a factor of $\Omega(n/\log(n))$. In preliminary computational experiments, our theoretical results translate into orders-of-magnitude reductions in runtime.
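To make concrete the linear optimization step that conditional gradient variants solve over $B(f)$, the following is a minimal sketch of Edmonds' greedy algorithm, the standard way to maximize a linear function over a submodular base polytope. It is an illustration only and not the projection toolkit developed in the paper; the names `greedy_vertex` and `cardinality_based`, and the two example polytopes, are assumptions introduced here.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_vertex(w: Sequence[float],
                  f: Callable[[FrozenSet[int]], float]) -> List[float]:
    """Return a vertex of B(f) maximizing <w, x> via Edmonds' greedy algorithm.

    Assumes f is submodular with f(emptyset) = 0 (not checked here).
    """
    n = len(w)
    # Visit elements in order of decreasing weight (ties keep index order).
    order = sorted(range(n), key=lambda i: -w[i])
    x = [0.0] * n
    prefix = set()
    prev = float(f(frozenset()))
    for i in order:
        prefix.add(i)
        cur = float(f(frozenset(prefix)))
        x[i] = cur - prev  # marginal gain of adding element i to the prefix
        prev = cur
    return x


def cardinality_based(g: Sequence[float]) -> Callable[[FrozenSet[int]], float]:
    """Build f(S) = g[|S|]; f is submodular when g is concave with g[0] = 0."""
    return lambda S: g[len(S)]


# Hypothetical examples on a ground set of size 3.
simplex_f = cardinality_based([0, 1, 1, 1])        # B(f) = probability simplex
permutahedron_f = cardinality_based([0, 3, 5, 6])  # B(f) = permutahedron on (3, 2, 1)

w = [0.3, -1.0, 2.0]
simplex_vertex = greedy_vertex(w, simplex_f)
permutahedron_vertex = greedy_vertex(w, permutahedron_f)
```

The sketch calls the function oracle $n$ times after one sort, which is why a linear optimization step is typically much cheaper than a projection. Both example polytopes are cardinality-based, the special case for which the abstract reports the $\Omega(n/\log(n))$ speedup for certain Bregman projections.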