We prove that a classic sub-Gaussian mixture, originally proposed by Robbins in a stochastic setting, in fact satisfies a path-wise (deterministic) regret bound. For every path in a natural ``Ville event'' $E_α$, the regret up to time $T$ is bounded, up to universal constants, by $\ln^2(1/α)/V_T + \ln (1/α) + \ln \ln V_T$, where $V_T$ is a nonnegative, nondecreasing, cumulative variance process. (The bound reduces to $\ln(1/α) + \ln \ln V_T$ if $V_T \geq \ln(1/α)$.) If the data are in fact stochastic, then $E_α$ can be shown to have probability at least $1-α$ under a wide class of distributions (e.g., sub-Gaussian, symmetric, or variance-bounded). Moreover, on the probability-one Ville event $E_0$, the regret along every path is eventually bounded by $\ln \ln V_T$ (up to constants). We explain how this work helps bridge the world of adversarial online learning (which usually deals with regret bounds for bounded data) with game-theoretic statistics (which can handle unbounded data, albeit under stochastic assumptions). In short, conditional regret bounds serve as a bridge between stochastic and adversarial betting.
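The parenthetical reduction follows from a one-step calculation (a sketch, using only the terms of the stated bound):

\[
V_T \geq \ln(1/α)
\;\implies\;
\frac{\ln^2(1/α)}{V_T} \leq \frac{\ln^2(1/α)}{\ln(1/α)} = \ln(1/α),
\]

so the first term is absorbed into the second, and the bound becomes $\ln(1/α) + \ln \ln V_T$ up to universal constants.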