In many applications, learning systems must process continuous, non-stationary data streams. We study this problem in an online learning framework and propose an algorithm that handles adversarial, time-varying, nonlinear constraints. We show that the algorithm, called Constraint Violation Velocity Projection (CVV-Pro), achieves $\sqrt{T}$ regret and converges to the feasible set at a rate of $1/\sqrt{T}$, even though the feasible set is slowly time-varying and a priori unknown to the learner. CVV-Pro relies only on local sparse linear approximations of the feasible set and therefore avoids optimizing over the entire set at each iteration, in sharp contrast to projected gradient or Frank-Wolfe methods. We also evaluate the algorithm empirically on two-player games in which the players are subject to a shared constraint.
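To make the idea of replacing a full projection with a local linear approximation concrete, the following is a minimal sketch of one update step. It is not the authors' implementation. It assumes a single constraint with feasibility defined as $g(x) \ge 0$, and it assumes that, when the constraint is violated, the velocity $v$ is restricted to the half-space $\nabla g(x)^\top v + \alpha g(x) \ge 0$. The function name `velocity_step` and the parameters `alpha` and `eta` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def velocity_step(
    x: Sequence[float],
    loss_grad: Sequence[float],
    g_val: float,
    g_grad: Sequence[float],
    alpha: float,
    eta: float,
) -> Vector:
    """One illustrative update using only local constraint information.

    Assumptions (not taken from the source): feasibility means g(x) >= 0,
    and a violated constraint restricts the velocity v to the half-space
    g_grad . v + alpha * g_val >= 0.
    """
    # Unconstrained descent direction on the current loss.
    u = [-d for d in loss_grad]
    v = list(u)

    if g_val <= 0.0:
        # The constraint is active or violated: check the linearized condition.
        slack = dot(g_grad, u) + alpha * g_val
        norm_sq = dot(g_grad, g_grad)
        if slack < 0.0 and norm_sq > 0.0:
            # Closed-form Euclidean projection of u onto the half-space.
            scale = -slack / norm_sq
            v = [ui + scale * ai for ui, ai in zip(u, g_grad)]

    # Move along the projected velocity with step size eta.
    return [xi + eta * vi for xi, vi in zip(x, v)]


if __name__ == "__main__":
    # Constraint g(x) = 1 - x[0]; the point x = (2, 0) violates it.
    x_next = velocity_step(
        x=[2.0, 0.0],
        loss_grad=[-1.0, 0.0],  # the loss alone would push x[0] upward
        g_val=-1.0,
        g_grad=[-1.0, 0.0],
        alpha=1.0,
        eta=0.5,
    )
    print(x_next)
```

In this example the loss gradient alone would move the iterate further from the feasible set, while the projected velocity moves it back toward it. The step needs only the value and gradient of the constraint at the current point, never a projection onto the feasible set itself. With several violated constraints the closed form no longer applies, and a small quadratic program over the velocity would be needed instead.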