This paper studies distributed online convex optimization with time-varying coupled constraints, motivated by distributed online control in network systems. Most prior work assumes a separability condition: the global objective and coupled constraint functions are sums of local costs and individual constraints. In contrast, we study a group of agents, networked via a communication graph, that collectively select actions to minimize a sequence of nonseparable global cost functions and to satisfy nonseparable long-term constraints, based on full-information feedback and inter-agent communication. We propose a distributed online primal-dual belief consensus algorithm, in which each agent maintains and updates a local belief of the global collective decision and repeatedly exchanges it with neighboring agents. Unlike previous consensus primal-dual algorithms under separability, which require agents to communicate only their local decisions, our belief-sharing protocol decouples the primal consensus disagreement from the dual constraint violation, yielding sublinear regret and cumulative constraint violation (CCV) bounds, both of order $O(T^{1/2})$, where $T$ denotes the time horizon. This result breaks the long-standing $O(T^{3/4})$ barrier for CCV and matches the lower bound for online constrained convex optimization, showing that online learning efficiency is attained at the cost of communication overhead.
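To make the algorithmic idea concrete, the following is a minimal, hypothetical sketch of an online primal-dual scheme with belief consensus: every agent keeps its own belief of the full global decision, mixes beliefs with neighbors through a doubly stochastic matrix, then takes a primal gradient step on a local Lagrangian and a dual ascent step on the coupled constraint. All problem data here (the quadratic cost, the linear coupled constraint, the three-agent mixing matrix, and the step size) are illustrative assumptions, not the paper's actual construction or analysis.

```python
import numpy as np

def belief_consensus_primal_dual(T=2000, n=3, d=2, seed=0):
    rng = np.random.default_rng(seed)
    # Doubly stochastic mixing matrix for a fully connected 3-agent graph
    # (hypothetical choice; any doubly stochastic W matching the graph works).
    W = 0.25 * np.ones((n, n)) + 0.25 * np.eye(n)
    X = rng.normal(size=(n, d))   # each agent's belief of the GLOBAL decision
    lam = np.zeros(n)             # each agent's local dual variable
    eta = 1.0 / np.sqrt(T)        # step size of order T^{-1/2}
    target = np.ones(d)           # cost f(x) = ||x - target||^2 (illustrative)
    ccv = 0.0                     # cumulative constraint violation of the mean belief
    for _ in range(T):
        X = W @ X                 # belief consensus: average neighbors' beliefs
        for i in range(n):
            x = X[i]
            g = x.sum() - 1.0     # coupled constraint g(x) = sum(x) - 1 <= 0
            # Gradient of the local Lagrangian f(x) + lam_i * g(x).
            grad = 2.0 * (x - target) + lam[i] * np.ones(d)
            x = np.clip(x - eta * grad, -2.0, 2.0)   # project onto a compact box
            lam[i] = max(0.0, lam[i] + eta * g)      # dual ascent on violation
            X[i] = x
        ccv += max(0.0, X.mean(axis=0).sum() - 1.0)
    return X, ccv / T

beliefs, avg_violation = belief_consensus_primal_dual()
```

Under this toy setup the constrained optimum is $x = (0.5, 0.5)$; the beliefs agree across agents (the mixing matrix contracts disagreement geometrically) and settle near that point, while the running average of the constraint violation stays small.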