We consider decentralized gradient-free optimization for minimizing Lipschitz continuous functions that satisfy neither smoothness nor convexity assumptions. We propose two novel gradient-free algorithms, the Decentralized Gradient-Free Method (DGFM) and its variant, the Decentralized Gradient-Free Method$^+$ (DGFM$^{+}$). Building on randomized smoothing and gradient tracking, DGFM queries the zeroth-order oracle on only a single sample per iteration, which keeps the computational demand on each individual node low. Theoretically, DGFM achieves a complexity of $\mathcal O(d^{3/2}\delta^{-1}\varepsilon^{-4})$ for obtaining a $(\delta,\varepsilon)$-Goldstein stationary point. DGFM$^{+}$, an advanced version of DGFM, incorporates variance reduction to further improve convergence. It samples a mini-batch at each iteration and periodically draws a larger batch of data, which improves the complexity to $\mathcal O(d^{3/2}\delta^{-1} \varepsilon^{-3})$. Moreover, experimental results on real-world datasets underscore the empirical advantages of the proposed algorithms.
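To make the randomized-smoothing ingredient concrete, the following is a minimal sketch of a two-point zeroth-order gradient estimator of the kind commonly paired with uniform smoothing over a ball of radius $\delta$. The abstract does not spell out the exact estimator, so the formula $\frac{d}{2\delta}\bigl(f(x+\delta w)-f(x-\delta w)\bigr)w$ with $w$ uniform on the unit sphere is an assumption here, and the names `sample_unit_sphere`, `two_point_estimate`, and `l1_norm` are hypothetical. Gradient tracking and the decentralized communication step are not shown.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:  # guard against the (measure-zero) all-zero draw
            return [v / norm for v in g]


def two_point_estimate(f, x, delta, rng):
    """Assumed two-point estimator: (d / (2*delta)) * (f(x + delta*w) - f(x - delta*w)) * w.

    Uses only two function evaluations (zeroth-order oracle calls), no gradients.
    """
    d = len(x)
    w = sample_unit_sphere(d, rng)
    x_plus = [xi + delta * wi for xi, wi in zip(x, w)]
    x_minus = [xi - delta * wi for xi, wi in zip(x, w)]
    scale = d * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * wi for wi in w]


def l1_norm(x):
    """A Lipschitz, nonsmooth test function (illustrative only)."""
    return sum(abs(v) for v in x)


rng = random.Random(0)
x = [1.0, -2.0, 0.5]
g = two_point_estimate(l1_norm, x, delta=0.1, rng=rng)
```

A single estimate is noisy, but it is cheap: each call costs two function evaluations regardless of the dimension $d$. Averaging many such estimates approximates the gradient of the smoothed surrogate, which is the role a mini-batch would play in a variance-reduced variant.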