In this paper, we study the distributed experts problem, where $n$ experts are distributed across $s$ servers for $T$ timesteps. The loss of each expert at each time $t$ is the $\ell_p$ norm of the vector consisting of that expert's losses at each of the $s$ servers at time $t$. The goal is to minimize the regret $R$, i.e., the loss of the distributed protocol compared to the loss of the best expert, amortized over all $T$ timesteps, while using the minimum amount of communication. We give a protocol that achieves regret roughly $R\gtrsim\frac{1}{\sqrt{T}\cdot\text{poly}\log(nsT)}$, using $\mathcal{O}\left(\frac{n}{R^2}+\frac{s}{R^2}\right)\cdot\max(s^{1-2/p},1)\cdot\text{poly}\log(nsT)$ bits of communication, which improves on previous work.
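To make the quantities above concrete, the following is a minimal sketch, not the protocol itself. It assumes losses are stored as a nested list `losses[t][i][j]` (time, expert, server) and that the amortized regret is the protocol's total loss minus the best expert's total loss, divided by $T$. The function names, the data layout, and the `communication_scale` helper (which drops constants and the $\text{poly}\log(nsT)$ factor) are illustrative assumptions.

```python
import math


def aggregated_loss(server_losses, p):
    """l_p norm of one expert's per-server losses at a single timestep."""
    if math.isinf(p):
        return max(abs(x) for x in server_losses)
    return sum(abs(x) ** p for x in server_losses) ** (1.0 / p)


def amortized_regret(losses, choices, p):
    """Regret per timestep of a sequence of expert choices.

    losses[t][i][j]: loss of expert i on server j at time t (assumed layout).
    choices[t]: index of the expert the protocol follows at time t.
    """
    T = len(losses)
    n = len(losses[0])
    protocol_total = sum(
        aggregated_loss(losses[t][choices[t]], p) for t in range(T)
    )
    best_total = min(
        sum(aggregated_loss(losses[t][i], p) for t in range(T))
        for i in range(n)
    )
    return (protocol_total - best_total) / T


def communication_scale(n, s, R, p):
    """(n/R^2 + s/R^2) * max(s^(1-2/p), 1), ignoring constants and polylog factors."""
    return (n / R**2 + s / R**2) * max(s ** (1 - 2 / p), 1.0)


# Toy instance: T = 2 timesteps, n = 2 experts, s = 2 servers, p = 2.
toy_losses = [
    [[0.3, 0.4], [0.0, 0.0]],  # t = 0: expert 0 has l_2 loss 0.5, expert 1 has 0
    [[0.0, 0.0], [0.6, 0.8]],  # t = 1: expert 0 has l_2 loss 0, expert 1 has 1
]
toy_regret = amortized_regret(toy_losses, choices=[0, 1], p=2)  # about 0.5
```

In this toy instance the protocol picks the worse expert at both timesteps, so its total loss is about 1.5 against about 0.5 for the best fixed expert. For $p \le 2$ the factor $\max(s^{1-2/p},1)$ equals 1, so `communication_scale` depends on $p$ only when $p > 2$.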