In many applications, learning systems are required to process continuous non-stationary data streams. We study this problem in an online learning framework and propose an algorithm that can deal with adversarial time-varying and nonlinear constraints. We show that the algorithm, called Constraint Violation Velocity Projection (CVV-Pro), achieves $\sqrt{T}$ regret and converges to the feasible set at a rate of $1/\sqrt{T}$, even though the feasible set is slowly time-varying and a priori unknown to the learner. CVV-Pro relies only on local sparse linear approximations of the feasible set and therefore avoids optimizing over the entire set at each iteration, in sharp contrast to projected gradient or Frank-Wolfe methods. We also empirically evaluate our algorithm on two-player games in which the players are subject to a shared constraint.
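To make the idea of replacing a full projection with a local linear approximation more concrete, here is a minimal sketch of one possible velocity-projection step. It is not the authors' implementation: it assumes a single, fixed constraint written as $g(x) \ge 0$, and the helper name `velocity_projection_step`, the step size `eta`, the gain `alpha`, and the toy problem are all hypothetical choices made for illustration. When the constraint is active or violated, the descent velocity is projected onto the half-space given by the linearization of $g$ at the current iterate, which has a closed form.

```python
import numpy as np

def velocity_projection_step(x, grad_f, g, grad_g, eta=0.1, alpha=1.0):
    """One illustrative update for a single constraint g(x) >= 0 (hypothetical helper)."""
    v = -grad_f(x)  # unconstrained descent velocity
    gx = g(x)
    if gx <= 0.0:  # constraint is active or violated at x
        a = grad_g(x)  # assumed nonzero whenever the constraint is active
        slack = a @ v + alpha * gx  # admissible velocities satisfy slack >= 0
        if slack < 0.0:
            # Euclidean projection of v onto the half-space {u : a @ u + alpha * gx >= 0}
            v = v - (slack / (a @ a)) * a
    return x + eta * v

# Toy problem (assumed for illustration): the feasible set is the unit disc,
# and the loss pulls the iterate toward a point outside of it.
target = np.array([3.0, 0.0])
g = lambda x: 1.0 - x @ x
grad_g = lambda x: -2.0 * x
grad_f = lambda x: x - target  # gradient of 0.5 * ||x - target||^2

x = np.array([2.0, 0.0])  # infeasible starting point
first_step = velocity_projection_step(x, grad_f, g, grad_g)
for _ in range(200):
    x = velocity_projection_step(x, grad_f, g, grad_g)
x_final = x
```

In this toy run the iterate moves toward the boundary of the disc using only the value and gradient of `g` at the current point; no projection onto the disc itself is ever computed. The time-varying, adversarial, multi-constraint setting described in the abstract is beyond what this sketch covers.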