We study unconstrained Online Linear Optimization with Lipschitz losses. Motivated by the pursuit of instance optimality, we propose a new algorithm that simultaneously achieves ($i$) AdaGrad-style second-order gradient adaptivity and ($ii$) comparator-norm adaptivity, also known as "parameter freeness" in the literature. In particular,
- our algorithm does not employ the impractical doubling trick and does not require an a priori estimate of the time-uniform Lipschitz constant;
- the associated regret bound has the optimal $O(\sqrt{V_T})$ dependence on the gradient variance $V_T$, without the typical logarithmic multiplicative factor;
- the leading constant in the regret bound is "almost" optimal.
Central to these results is a continuous-time approach to online learning. We first show that the desired simultaneous adaptivity can be achieved fairly easily in a continuous-time analogue of the problem, where the environment is modeled by an arbitrary continuous semimartingale. Our key innovation is then a new discretization argument that preserves this adaptivity in the discrete-time adversarial setting. It refines the non-gradient-adaptive discretization argument of (Harvey et al., 2023), both algorithmically and analytically, and could be of independent interest.
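For concreteness, a minimal sketch of the standard unconstrained OLO notation behind these claims (textbook definitions, not taken verbatim from the paper): in round $t$ the learner plays $x_t \in \mathbb{R}^d$, the adversary reveals a loss gradient $g_t$, and performance against an arbitrary comparator $u$ is measured by
$$\mathrm{Regret}_T(u) \;=\; \sum_{t=1}^{T} \langle g_t, \, x_t - u \rangle, \qquad V_T \;=\; \sum_{t=1}^{T} \lVert g_t \rVert^2.$$
Gradient adaptivity means the regret bound scales with $\sqrt{V_T}$ rather than $\sqrt{T}$; comparator-norm adaptivity (parameter freeness) means the bound holds simultaneously for all $u$, without tuning any hyperparameter to $\lVert u \rVert$.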