We consider prediction with expert advice under possibly heavy-tailed losses, i.e.\ the only assumption on the losses is an upper bound on their second moments, denoted by $\theta$. We develop adaptive algorithms that do not require any prior knowledge of the range or the second moment of the losses. The regret guarantees of existing adaptive algorithms contain a term that is typically considered lower-order. We show that this term, which is often the maximum of the losses, can actually dominate the regret bound in our setting. Specifically, we show that even when $\theta$ is a small constant, this term can scale as $\sqrt{KT}$, where $K$ is the number of experts and $T$ is the time horizon. We propose adaptive algorithms with improved regret bounds that avoid the dependence on such a term and guarantee $\mathcal{O}(\sqrt{\theta T\log(K)})$ regret in the worst case, and $\mathcal{O}(\theta \log(KT)/\Delta_{\min})$ regret when the losses are sampled i.i.d.\ from some fixed distribution, where $\Delta_{\min}$ is the difference between the mean losses of the second-best expert and the best expert. Additionally, when the loss function is the squared loss, our algorithm also guarantees improved regret bounds over prior results.
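The claim that the maximum loss can scale as $\sqrt{KT}$ even for constant $\theta$ can be sanity-checked on a toy distribution. The sketch below is an illustrative assumption and not necessarily the construction used in the paper: each loss equals $\sqrt{KT}$ with probability $1/(KT)$ and $0$ otherwise, so its second moment is $1$, yet among the $KT$ independent draws at least one spike of size $\sqrt{KT}$ appears with constant probability. The function name `spike_loss_stats` and the chosen values of `K` and `T` are hypothetical.

```python
import math


def spike_loss_stats(K: int, T: int):
    """Toy heavy-tailed loss (hypothetical): sqrt(K*T) w.p. 1/(K*T), else 0."""
    n = K * T                       # number of loss draws over all experts and rounds
    spike = math.sqrt(n)            # size of the rare large loss
    p = 1.0 / n                     # probability of a spike on a single draw
    second_moment = spike ** 2 * p  # E[loss^2] for this two-point distribution
    # Probability that at least one of the n independent draws is a spike.
    p_any_spike = 1.0 - (1.0 - p) ** n
    return second_moment, spike, p_any_spike


theta, max_loss, p_any = spike_loss_stats(K=10, T=1000)
print(f"second moment = {theta:.3f}, spike size = {max_loss:.1f}, "
      f"P(at least one spike) = {p_any:.3f}")
```

Under these assumptions the second moment stays bounded while the largest observed loss is of order $\sqrt{KT}$ with probability close to $1 - 1/e$, which is the regime in which a bound that depends on the maximum loss stops being lower-order.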