We study a variant of online convex optimization in which the player may switch decisions at most $S$ times in expectation over $T$ rounds. Similar problems have been addressed in prior work for discrete decision sets, and more recently in the continuous setting, but only against an adaptive adversary. In this work, we fill this gap by presenting computationally efficient algorithms for the more prevalent oblivious setting, establishing a regret bound of $O(T/S)$ for general convex losses and $\widetilde O(T/S^2)$ for strongly convex losses. In addition, for stochastic i.i.d.~losses, we present a simple algorithm that performs $\log T$ switches while incurring only a multiplicative $\log T$ factor overhead in its regret, in both the general and strongly convex settings. Finally, we complement our algorithms with lower bounds that match our upper bounds in some of the cases we consider.