Many online decision problems over combinatorial actions are addressed via convex relaxations, leading to online convex optimization (OCO) with piecewise-linear objectives and an induced polyhedral structure. We show that regret in such problems is governed by \emph{polyhedral instability}: the number of times the active region changes. Under full-information feedback and a fixed partition, if $\mathrm{RS}_T$ denotes the number of region switches and $V_{\max}$ the maximum number of vertices per region, we prove $\Regret_T = \Theta(\sqrt{(1+\mathrm{RS}_T)\,T\,\log V_{\max}})$, interpolating between experts-like and dimension-dependent OCO rates. For online submodular--concave games under Lovász convexification, the instability measure reduces to the permutation-switch count $\mathrm{SC}_T$, yielding the matching rate $\Regret_T = \Theta(\sqrt{(1+\mathrm{SC}_T)\,T\,\log n})$. Experiments on synthetic and real combinatorial problems (shortest path, influence maximization) validate the predicted scaling and indicate that low-instability regimes can arise in practice without explicit enumeration of actions.