This work introduces DADAO: the first decentralized, accelerated, asynchronous, primal, first-order algorithm for minimizing a sum of $L$-smooth and $\mu$-strongly convex functions distributed over a given network of size $n$. Our key insight is to model the local gradient updates and the gossip communication procedures with separate, independent Poisson Point Processes. This decouples the computation and communication steps, which can then run in parallel, and makes the whole approach completely asynchronous, leading to communication acceleration compared to synchronous approaches. Our new method employs primal gradients and uses neither a multi-consensus inner loop nor other ad-hoc mechanisms such as Error Feedback, Gradient Tracking, or a Proximal operator. By relating the inverse $\chi_1$ of the smallest positive eigenvalue of the Laplacian matrix and the maximal resistance $\chi_2\leq \chi_1$ of the graph to a sufficient minimal communication rate between the nodes of the network, we show that, up to logarithmic terms, our algorithm requires $\mathcal{O}(n\sqrt{\frac{L}{\mu}}\log(\frac{1}{\epsilon}))$ local gradients and only $\mathcal{O}(n\sqrt{\chi_1\chi_2}\sqrt{\frac{L}{\mu}}\log(\frac{1}{\epsilon}))$ communications to reach a precision $\epsilon$. Thus, we simultaneously obtain an accelerated rate for both computations and communications, improving over state-of-the-art works, and our simulations further validate the strength of our relatively unconstrained method. We also propose an SDP relaxation to find the gossip rate of each edge that minimizes the total number of communications for a given graph, resulting in faster convergence than standard approaches relying on uniform communication weights. Our source code is released on a public repository.
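To make the modeling idea concrete, the following minimal sketch generates an asynchronous event schedule in which every node carries its own Poisson clock for local gradient steps and every edge carries its own Poisson clock for gossip. It is not the DADAO update rule and is not taken from the released code: the function name `poisson_event_schedule`, the ring topology, and the rates `grad_rate` and `gossip_rate` are illustrative assumptions.

```python
import heapq
import random


def poisson_event_schedule(n_nodes, edges, grad_rate, gossip_rate, horizon, seed=0):
    """Return a time-sorted list of (time, kind, target) events on [0, horizon).

    Hypothetical helper: one independent Poisson clock per node ("grad")
    and one per edge ("gossip"); inter-event times are exponential.
    """
    rng = random.Random(seed)
    clocks = [("grad", i, grad_rate) for i in range(n_nodes)]
    clocks += [("gossip", e, gossip_rate) for e in edges]

    # Seed the heap with the first ring time of every clock.
    heap = []
    for idx, (_, _, rate) in enumerate(clocks):
        heapq.heappush(heap, (rng.expovariate(rate), idx))

    events = []
    while heap:
        t, idx = heapq.heappop(heap)
        if t >= horizon:
            continue  # this clock has no further events inside the horizon
        kind, target, rate = clocks[idx]
        events.append((t, kind, target))
        # Schedule the next ring of the same clock.
        heapq.heappush(heap, (t + rng.expovariate(rate), idx))
    return events


# Assumed toy setup: ring of 10 nodes, unit gradient rate, gossip rate 0.5 per edge.
ring = [(i, (i + 1) % 10) for i in range(10)]
events = poisson_event_schedule(10, ring, grad_rate=1.0, gossip_rate=0.5, horizon=200.0)
n_grad = sum(1 for _, kind, _ in events if kind == "grad")
n_gossip = len(events) - n_grad
print(n_grad, n_gossip)
```

Because the two families of clocks are independent, gradient and gossip events interleave with no global synchronization barrier, and the expected counts over a horizon $T$ are simply the sum of the rates times $T$ (here about 2000 gradient events and 1000 gossip events). Changing the per-edge gossip rate changes the communication budget without touching the computation schedule.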