We study Online Convex Optimization (OCO) with adversarial constraints, where an online algorithm must make repeated decisions to minimize both convex loss functions and cumulative constraint violations. We focus on a setting where the algorithm has access to predictions of the loss and constraint functions. Our results show that we can improve the current best bounds of $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ cumulative constraint violations to $O(\sqrt{E_T(f)})$ and $\tilde{O}(\sqrt{E_T(g)})$, respectively, where $E_T(f)$ and $E_T(g)$ denote the cumulative prediction errors of the loss and constraint functions. In the worst case, where $E_T(f) = O(T)$ and $E_T(g) = O(T)$ (assuming bounded loss and constraint functions), our rates match the prior $O(\sqrt{T})$ results. However, when the loss and constraint predictions are accurate, our approach yields significantly smaller regret and cumulative constraint violations. Notably, if the constraint function remains constant over time, we achieve $\tilde{O}(1)$ cumulative constraint violation, aligning with prior results.