In this work, we give a statistical characterization of the $\gamma$-regret for arbitrary structured bandit problems, that is, the regret incurred when comparing against a benchmark equal to $\gamma$ times the optimal value. The $\gamma$-regret arises in structured bandit problems over a function class $\mathcal{F}$ in which finding an exact optimum of $f \in \mathcal{F}$ is intractable. Our characterization is given in terms of the $\gamma$-DEC, a statistical complexity parameter for the class $\mathcal{F}$ that modifies the constrained Decision-Estimation Coefficient (DEC) of Foster et al. (2023) and is closely related to the original offset DEC of Foster et al. (2021). Our lower bound shows that the $\gamma$-DEC is a fundamental limit for any model class $\mathcal{F}$: for any algorithm, there exists some $f \in \mathcal{F}$ for which the $\gamma$-regret of that algorithm scales (nearly) with the $\gamma$-DEC of $\mathcal{F}$. We complement this with an upper bound showing that there exists an algorithm attaining a nearly matching $\gamma$-regret. Because applying prior results on the DEC to the $\gamma$-regret setting poses significant challenges, both our lower and upper bounds require novel techniques and a new algorithm.
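For concreteness, the $\gamma$-regret described above can be written as follows. This is a sketch using the standard convention for approximation-ratio regret; the symbols $T$ (number of rounds), $\Pi$ (decision set), $\pi_t$ (decision played in round $t$), and $f^\star \in \mathcal{F}$ (the unknown mean-reward function) are notational assumptions and do not appear in the abstract itself.

$$
\mathrm{Reg}_\gamma(T) \;=\; \sum_{t=1}^{T} \Big( \gamma \cdot \max_{\pi \in \Pi} f^\star(\pi) \;-\; f^\star(\pi_t) \Big), \qquad \gamma \in (0, 1].
$$

Under this convention, $\gamma = 1$ recovers the usual regret against the exact optimum, while $\gamma < 1$ compares the learner only against a $\gamma$-approximate solution, which is the natural benchmark when exact maximization of $f^\star$ is intractable.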