Adversarial online linear optimization (OLO) is essentially about making performance tradeoffs with respect to the unknown difficulty of the adversary. In the setting of one-dimensional fixed-time OLO on a bounded domain, it has been observed since Cover (1966) that the achievable tradeoffs are governed by probabilistic inequalities, and these descriptive results can be converted into algorithms via dynamic programming, which, however, is not computationally efficient. We address this limitation by showing that Stein's method, a classical framework underlying the proofs of probabilistic limit theorems, can be operationalized as computationally efficient OLO algorithms. The associated regret and total loss upper bounds are "additively sharp", meaning that they surpass the conventional big-O notion of optimality and match normal-approximation-based lower bounds up to additive lower-order terms. Our construction is inspired by the remarkably clean proof of a Wasserstein martingale central limit theorem (CLT) due to Röllin (2018). Several concrete benefits follow from this general technique. First, at the same computational complexity, the proposed algorithm improves upon the total loss upper bounds of online gradient descent (OGD) and multiplicative weights update (MWU). Second, our algorithm realizes a continuum of optimal two-point tradeoffs between the total loss and the maximum regret over comparators, improving upon prior works in parameter-free online learning. Third, by allowing the adversary to randomize over an unbounded support, we achieve sharp in-expectation performance guarantees for OLO with noisy feedback.