To expand the applicability of decentralized online learning, previous studies have proposed several algorithms for decentralized online continuous submodular maximization (D-OCSM), a non-convex/non-concave setting with continuous DR-submodular reward functions. However, large gaps remain between their approximate regret bounds and the regret bounds achieved in the convex setting. Moreover, when restricted to projection-free algorithms, which can efficiently handle complex decision sets, existing methods cannot even recover the approximate regret bounds achieved in the centralized setting. In this paper, we first demonstrate that for D-OCSM over general convex decision sets, these two issues can be addressed simultaneously. Furthermore, for D-OCSM over downward-closed decision sets, we show that the second issue can be addressed while the first is significantly alleviated. Our key techniques are two reductions from D-OCSM to decentralized online convex optimization (D-OCO), which exploit D-OCO algorithms to improve the approximate regret of D-OCSM in these two settings, respectively.