Decentralized bilevel optimization has been actively studied in recent years owing to its widespread applications in machine learning. However, existing algorithms suffer from high communication complexity caused by the estimation of the stochastic hypergradient, which limits their application to real-world tasks. To address this issue, we develop a novel decentralized stochastic bilevel gradient descent algorithm for the heterogeneous setting, which enjoys both a small communication cost in each round and a small number of communication rounds. Consequently, it achieves a much better communication complexity than existing algorithms without requiring any strong assumptions on heterogeneity. To the best of our knowledge, this is the first stochastic algorithm to achieve these theoretical guarantees under the heterogeneous setting. Finally, experimental results confirm the efficacy of our algorithm.
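To make the communication bottleneck concrete, below is a minimal sketch of a generic decentralized stochastic bilevel update, not the algorithm proposed here: each node takes a local inner SGD step on its lower-level loss, estimates the hypergradient with a truncated Neumann series standing in for the inverse Hessian, and gossip-averages the upper-level variable with its neighbors. All problem data (the local quadratics A_i, B_i, c_i), step sizes, and the mixing matrix W are illustrative assumptions.

```python
# Sketch of a generic decentralized stochastic bilevel step (NOT the paper's
# method). Each node i holds local losses
#   g_i(x, y) = 0.5*y'A_i y - y'B_i x   (lower level)
#   f_i(x, y) = 0.5*||y - c_i||^2       (upper level)
# and mixes its upper-level iterate with neighbors via a gossip matrix W.
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 3                               # nodes, problem dimension

# Heterogeneous local data (illustrative assumptions).
A = [np.eye(d) + 0.1 * rng.standard_normal((d, d)) for _ in range(n)]
A = [0.5 * (Ai + Ai.T) + d * np.eye(d) for Ai in A]  # symmetric, well conditioned
B = [rng.standard_normal((d, d)) for _ in range(n)]
c = [rng.standard_normal(d) for _ in range(n)]

# Doubly stochastic gossip matrix on a ring topology.
W = np.zeros((n, n))
for i in range(n):
    W[i, i], W[i, (i + 1) % n], W[i, (i - 1) % n] = 0.5, 0.25, 0.25

x = np.zeros((n, d)); y = np.zeros((n, d))
alpha, beta, K = 0.05, 0.1, 5             # step sizes, Neumann truncation level

for t in range(200):
    for i in range(n):
        # Inner step: stochastic gradient descent on g_i (added noise stands
        # in for sampling noise in the lower-level stochastic gradient).
        grad_y_g = A[i] @ y[i] - B[i] @ x[i] + 0.01 * rng.standard_normal(d)
        y[i] -= beta * grad_y_g

        # Hypergradient via the implicit function theorem:
        #   grad_x Phi = B_i' * inv(A_i) * grad_y f_i,
        # with inv(A_i) replaced by a K-term Neumann series
        #   inv(A_i) ~ eta * sum_k (I - eta*A_i)^k, valid for eta < 2/lmax(A_i).
        grad_y_f = y[i] - c[i]
        eta = 1.0 / (2 * d)
        v = eta * sum(np.linalg.matrix_power(np.eye(d) - eta * A[i], k) @ grad_y_f
                      for k in range(K))
        hypergrad = B[i].T @ v
        x[i] -= alpha * hypergrad

    x = W @ x                             # one gossip round on the upper variable

print("consensus gap:", np.linalg.norm(x - x.mean(axis=0)))
```

In this sketch the inverse-Hessian-vector product is computed locally, but in decentralized variants each Neumann term typically triggers additional neighbor communication, which is precisely the per-round cost the abstract identifies as the bottleneck.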