We study the multi-agent Smoothed Online Convex Optimization (SOCO) problem, where $N$ agents interact through a communication graph. In each round, each agent $i$ receives a strongly convex hitting cost function $f^i_t$ in an online fashion and selects an action $x^i_t \in \mathbb{R}^d$. The objective is to minimize the global cumulative cost, which includes the sum of individual hitting costs $f^i_t(x^i_t)$, a temporal "switching cost" for changing decisions, and a spatial "dissimilarity cost" that penalizes deviations in decisions among neighboring agents. We propose the first decentralized algorithm for multi-agent SOCO and prove its asymptotic optimality. Our approach allows each agent to operate using only local information from its immediate neighbors in the graph. For finite-time performance, we establish that the optimality gap in competitive ratio decreases with the time horizon $T$ and can be conveniently tuned based on the per-round computation available to each agent. Moreover, our results hold even when the communication graph changes arbitrarily and adaptively over time. Finally, we establish that the computational complexity per round depends only logarithmically on the number of agents and almost linearly on their degree within the graph, ensuring scalability for large-system implementations.
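As a reading aid, the global objective described above can be sketched as follows. The abstract does not give the exact form of the switching and dissimilarity costs, so the functions $c^i_t$ and $s^{(i,j)}_t$, the time-varying edge set $\mathcal{E}_t$, and the initial actions $x^i_0$ are assumed notation introduced here for illustration, not the paper's definitions.

```latex
% Illustrative sketch only: c, s, and \mathcal{E}_t are assumed notation.
\[
\min_{\{x^i_t\}} \;
\sum_{t=1}^{T} \Bigg(
  \underbrace{\sum_{i=1}^{N} f^i_t\big(x^i_t\big)}_{\text{hitting cost}}
  \;+\;
  \underbrace{\sum_{i=1}^{N} c^i_t\big(x^i_t,\, x^i_{t-1}\big)}_{\text{temporal switching cost}}
  \;+\;
  \underbrace{\sum_{(i,j) \in \mathcal{E}_t} s^{(i,j)}_t\big(x^i_t,\, x^j_t\big)}_{\text{spatial dissimilarity cost}}
\Bigg)
\]
```

Under this notation, the competitive ratio compares the total cost incurred by the online algorithm with that of the offline optimum, which knows every $f^i_t$ in advance. One concrete instance, again an assumption, takes both $c^i_t$ and $s^{(i,j)}_t$ to be weighted squared Euclidean distances between their two arguments.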