Smoothed online learning has emerged as a popular framework to mitigate the substantial loss in statistical and computational efficiency that arises when one moves from classical to adversarial learning. Unfortunately, it has been shown that for some spaces, efficient algorithms suffer regret exponentially worse than the minimax-optimal rate, even when the learner has access to an optimization oracle over the space. To mitigate that exponential dependence, this work introduces a new notion of complexity, the generalized bracketing numbers, which marries constraints on the adversary to the size of the space, and shows that an instantiation of Follow-the-Perturbed-Leader can attain low regret with the number of calls to the optimization oracle scaling optimally with respect to average regret. We then instantiate our bounds in several problems of interest, including online prediction and planning of piecewise continuous functions, which have many applications in fields as diverse as econometrics and robotics.