A new algorithm for regret minimization in online convex optimization is described. The regret of the algorithm after $T$ time periods is $O(\sqrt{T \log T})$, which is the minimum possible up to a logarithmic term. In addition, the new algorithm is adaptive, in the sense that the regret bounds hold not only for the time periods $1,\ldots,T$ but also for every sub-interval $s,s+1,\ldots,t$. The running time of the algorithm matches that of newly introduced interior point algorithms for regret minimization: in $n$-dimensional space, during each iteration the new algorithm essentially solves a system of linear equations of order $n$, rather than solving a constrained convex optimization problem in $n$ dimensions with possibly many constraints.
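For concreteness, the two notions of regret referred to above can be written out in the standard way. The notation here is an assumption for illustration and does not appear in the abstract: $K \subseteq \mathbb{R}^n$ is the convex feasible set, $f_\tau$ is the convex loss revealed at period $\tau$, and $x_\tau \in K$ is the point the algorithm plays at period $\tau$.

$$
\mathrm{Regret}_T \;=\; \sum_{\tau=1}^{T} f_\tau(x_\tau) \;-\; \min_{x \in K} \sum_{\tau=1}^{T} f_\tau(x),
\qquad
R(s,t) \;=\; \sum_{\tau=s}^{t} f_\tau(x_\tau) \;-\; \min_{x \in K} \sum_{\tau=s}^{t} f_\tau(x).
$$

Under this notation, the first claim reads $\mathrm{Regret}_T = O(\sqrt{T \log T})$, and adaptivity means that the regret bound also controls $R(s,t)$ for every $1 \le s \le t \le T$, with the comparator $x$ chosen separately for each interval.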