We show that any randomized first-order algorithm that minimizes a $d$-dimensional, $1$-Lipschitz convex function over the unit ball must either use $\Omega(d^{2-\delta})$ bits of memory or make $\Omega(d^{1+\delta/6-o(1)})$ queries, for any constant $\delta\in (0,1)$, when the precision $\epsilon$ is quasipolynomially small in $d$. Our result implies that cutting plane methods, which use $\tilde{O}(d^2)$ bits of memory and $\tilde{O}(d)$ queries, are Pareto-optimal among randomized first-order algorithms, and that quadratic memory is required to achieve optimal query complexity for convex optimization.
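
As an illustration of the tradeoff, here is the stated bound at one sample value of $\delta$. The choice $\delta = 1/2$ is ours and is used only for concreteness; the limiting values are read off from the exponents and are not additional claims.

$$
\delta = \tfrac{1}{2}: \qquad \text{memory } \Omega\!\left(d^{\,2-\frac{1}{2}}\right) = \Omega\!\left(d^{3/2}\right) \quad \text{or} \quad \text{queries } \Omega\!\left(d^{\,1+\frac{1}{12}-o(1)}\right) = \Omega\!\left(d^{13/12-o(1)}\right).
$$

As $\delta \to 0$, the memory exponent $2-\delta$ approaches $2$ and the query exponent $1+\delta/6$ approaches $1$, which matches the $\tilde{O}(d^2)$ memory and $\tilde{O}(d)$ queries of cutting plane methods up to lower-order factors. As $\delta \to 1$, the memory exponent approaches $1$ and the query exponent approaches $7/6$.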