Decentralized bilevel optimization has been actively studied in recent years because of its widespread applications in machine learning. However, existing algorithms suffer from high communication complexity caused by estimating the stochastic hypergradient, which limits their application to real-world tasks. To address this issue, we develop a novel decentralized stochastic bilevel gradient descent algorithm for the heterogeneous setting that incurs a small communication cost in each round and requires a small number of communication rounds. As a result, it achieves a much better communication complexity than existing algorithms without any strong assumptions regarding heterogeneity. To the best of our knowledge, this is the first stochastic algorithm to achieve these theoretical results under the heterogeneous setting. Finally, experimental results confirm the efficacy of our algorithm.