Decentralized Multi-Task Online Convex Optimization Under Random Link Failures

Decentralized optimization methods typically rely on information exchange between neighbors, yet transmissions can fail because of network congestion, hardware or software faults, communication outages, and other factors. In this paper, we investigate random link failures in decentralized multi-task online convex optimization, where each agent makes its own decision and the decisions are coupled through pairwise constraints. Although saddle-point algorithms are widely used in constrained optimization, conventional versions are not directly applicable here because of random packet drops. To address this issue, we develop a decentralized saddle-point algorithm that is robust to random link failures with heterogeneous probabilities: each agent replaces a neighbor's missing decision with the latest value received from that neighbor. By carefully bounding the accumulated deviation caused by this replacement, we first establish that our algorithm achieves $\mathcal{O}(\sqrt{T})$ regret and $\mathcal{O}(T^\frac{3}{4})$ constraint violation in the full-information scenario, where the complete local cost function is revealed to each agent at the end of each time slot. These two bounds match, in order, the performance bounds of algorithms with perfect communication. We then extend the algorithm and analysis to the two-point bandit feedback scenario, where each agent sequentially observes only the values of its local cost function at two random points, and derive performance bounds of the same orders as in the full-information case. Finally, we corroborate the efficacy of the proposed algorithms and the analytical results through numerical simulations.
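
The replacement rule described above, in which a stale copy of a neighbor's decision stands in whenever a packet is dropped, can be illustrated with a toy primal-dual loop. This is a minimal sketch under stated assumptions, not the algorithm analyzed in the paper: the ring topology, the noisy quadratic costs, the constraint $(x_i - x_j)^2 \le \gamma$, the step size, the dual regularization weight, and all names (`simulate`, `last`, `lam`) are hypothetical choices made only for illustration.

```python
import random


def simulate(T=2000, n=4, gamma=0.25, seed=0):
    """Toy saddle-point loop on a ring whose links fail at random."""
    rng = random.Random(seed)
    # Assumed ring topology: agent i exchanges decisions with i-1 and i+1.
    nbrs = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
    # Heterogeneous success probability for each directed link j -> i.
    p = {(i, j): rng.uniform(0.5, 0.9) for i in nbrs for j in nbrs[i]}
    targets = [float(i) for i in range(n)]  # each task has its own optimum
    x = [0.0] * n
    lam = {key: 0.0 for key in p}  # one dual variable per pairwise constraint
    # last[(i, j)] holds the latest value of x_j that agent i received.
    last = {(i, j): x[j] for (i, j) in p}
    eta = 1.0 / T ** 0.5  # step size of order 1/sqrt(T) (assumed)
    delta = 1.0           # dual regularization weight (assumed)
    violation = 0.0
    for _ in range(T):
        # Communication round: a dropped packet leaves the stale copy in place.
        for (i, j), prob in p.items():
            if rng.random() < prob:
                last[(i, j)] = x[j]
        new_x = list(x)
        for i in range(n):
            # Full-information feedback: f_i(x) = 0.5 * (x - a)^2 with noisy a.
            a = targets[i] + rng.gauss(0.0, 0.1)
            grad = x[i] - a
            for j in nbrs[i]:
                diff = x[i] - last[(i, j)]  # uses the latest received value
                g = diff * diff - gamma     # constraint function, want g <= 0
                grad += lam[(i, j)] * 2.0 * diff
                # Projected dual ascent on the regularized Lagrangian.
                lam[(i, j)] = max(
                    0.0, lam[(i, j)] + eta * (g - delta * eta * lam[(i, j)])
                )
            # Projected primal descent onto the box [-5, 5].
            new_x[i] = min(5.0, max(-5.0, x[i] - eta * grad))
        x = new_x
        # Accumulate the true violation, measured with the actual decisions.
        violation += sum(
            max(0.0, (x[i] - x[j]) ** 2 - gamma) for (i, j) in p
        )
    return x, violation


if __name__ == "__main__":
    decisions, total_violation = simulate()
    print(decisions, total_violation)
```

Measuring the violation with the true decisions while updating with the stale copies makes the gap between the two visible; the deviation introduced by that gap is the quantity the analysis has to bound.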
