We study a variant of online convex optimization where the player is permitted to switch decisions at most $S$ times in expectation throughout $T$ rounds. Similar problems have been addressed in prior work for the discrete decision set setting, and more recently in the continuous setting but only with an adaptive adversary. In this work, we aim to fill the gap and present computationally efficient algorithms in the more prevalent oblivious setting, establishing a regret bound of $O(T/S)$ for general convex losses and $\widetilde O(T/S^2)$ for strongly convex losses. In addition, for stochastic i.i.d.~losses, we present a simple algorithm that performs $\log T$ switches with only a multiplicative $\log T$ factor overhead in its regret in both the general and strongly convex settings. Finally, we complement our algorithms with lower bounds that match our upper bounds in some of the cases we consider.
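To make the setting concrete, the two quantities traded off above can be written out as follows. The notation is an assumption introduced for illustration and does not come from the source: $x_t$ is the player's decision in round $t$, $f_t$ is the convex loss fixed in advance by the oblivious adversary, and $\mathcal{K}$ is the convex decision set.

```latex
% Assumed notation: x_t = decision in round t, f_t = loss in round t,
% \mathcal{K} = convex decision set. None of these symbols appear in the source.

% Expected regret against the best fixed decision in hindsight.
\[
  \mathrm{Regret}_T
  = \mathbb{E}\Big[\sum_{t=1}^{T} f_t(x_t)\Big]
  - \min_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x)
\]

% Switching budget: the expected number of rounds in which the decision changes.
\[
  \mathbb{E}\Big[\sum_{t=1}^{T-1} \mathbb{1}\{x_{t+1} \neq x_t\}\Big] \le S
\]
```

Under this reading, the stated $O(T/S)$ bound for general convex losses means, for instance, that a budget of $S = \sqrt{T}$ expected switches would give regret $O(\sqrt{T})$.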