We provide an online learning algorithm that obtains regret $G\|w_\star\|\sqrt{T\log(\|w_\star\|G\sqrt{T})} + \|w_\star\|^2 + G^2$ on $G$-Lipschitz convex losses for any comparison point $w_\star$, without knowing either $G$ or $\|w_\star\|$. Importantly, this matches the optimal bound $G\|w_\star\|\sqrt{T}$ available with such knowledge (up to logarithmic factors), unless either $\|w_\star\|$ or $G$ is so large that even $G\|w_\star\|\sqrt{T}$ is roughly linear in $T$. Thus, it matches the optimal bound in all cases in which one can achieve sublinear regret, which arguably covers most "interesting" scenarios.
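To make the dominance claim concrete, here is a minimal sketch (our own illustration under an assumed parameter regime, not a statement quoted from the result): suppose $\|w_\star\| \le G\sqrt{T}$, $G \le \|w_\star\|\sqrt{T}$, and $\log(\|w_\star\|G\sqrt{T}) \ge 1$. Then
\begin{align*}
\|w_\star\|^2 &\le G\|w_\star\|\sqrt{T}, \qquad G^2 \le G\|w_\star\|\sqrt{T},\\
\text{so}\quad G\|w_\star\|\sqrt{T\log(\|w_\star\|G\sqrt{T})} + \|w_\star\|^2 + G^2 &\le 3\,G\|w_\star\|\sqrt{T\log(\|w_\star\|G\sqrt{T})},
\end{align*}
i.e. the bound is $G\|w_\star\|\sqrt{T}$ up to a $\sqrt{\log}$ factor. Conversely, if (say) $\|w_\star\| > G\sqrt{T}$ with $G \ge 1$, then $G\|w_\star\|\sqrt{T} > G^2 T \ge T$, so even the optimal bound is already linear in $T$.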