We consider the problem setting of prediction with expert advice under possibly heavy-tailed losses, i.e., the only assumption on the losses is an upper bound on their second moments, denoted by $\theta$. We develop adaptive algorithms that do not require any prior knowledge about the range or the second moment of the losses. Existing adaptive algorithms carry what is typically considered a lower-order term in their regret guarantees. We show that this lower-order term, which is often the maximum of the losses, can actually dominate the regret bound in our setting. Specifically, we show that even with a small constant $\theta$, this lower-order term can scale as $\sqrt{KT}$, where $K$ is the number of experts and $T$ is the time horizon. We propose adaptive algorithms with improved regret bounds that avoid any dependence on such a lower-order term and guarantee $\mathcal{O}(\sqrt{\theta T\log(K)})$ regret in the worst case, and $\mathcal{O}(\theta\log(KT)/\Delta_{\min})$ regret when the losses are sampled i.i.d. from some fixed distribution, where $\Delta_{\min}$ is the gap between the mean losses of the second-best expert and the best expert. Additionally, when the loss function is the squared loss, our algorithm also guarantees improved regret bounds over prior results.
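To make the setting concrete, below is a minimal sketch of the prediction-with-expert-advice protocol using the classical Hedge (exponential weights) baseline. This is not the paper's adaptive algorithm: the fixed learning rate `eta` assumes knowledge of the horizon, which is exactly the kind of prior knowledge the adaptive algorithms avoid, and the heavy-tailed Student-t loss simulation is an illustrative assumption of ours, not an experiment from the paper.

```python
import numpy as np

def hedge(losses, eta):
    """Classical Hedge / exponential-weights baseline for prediction
    with expert advice. `losses` is a (T, K) array of per-round losses
    for each of the K experts; `eta` is a fixed learning rate.
    Returns the learner's cumulative loss and its regret against the
    best single expert in hindsight."""
    T, K = losses.shape
    weights = np.ones(K)
    learner_loss = 0.0
    for t in range(T):
        p = weights / weights.sum()          # play the normalized weights
        learner_loss += p @ losses[t]        # expected loss this round
        weights *= np.exp(-eta * losses[t])  # multiplicative update
    regret = learner_loss - losses.sum(axis=0).min()
    return learner_loss, regret

# Toy illustration with heavy-tailed losses: Student-t with df=3 has a
# bounded second moment, yet individual losses can be very large, so a
# max-loss ("range") term in a regret bound can dominate -- the
# phenomenon the abstract describes.
rng = np.random.default_rng(0)
T, K = 10_000, 8
losses = rng.standard_t(df=3, size=(T, K))
_, regret = hedge(losses, eta=np.sqrt(np.log(K) / T))
print(f"regret vs. best expert: {regret:.2f} "
      f"(sqrt(T log K) scale for reference: {np.sqrt(T * np.log(K)):.2f})")
```

Under the bounded-second-moment assumption, the paper's algorithms replace this fixed tuning with adaptive schemes whose bounds depend on $\theta$ rather than on the realized loss range.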