Stochastic bilevel optimization (SBO) is increasingly important in machine learning because of its versatility in handling nested structures. For large-scale SBO, decentralized approaches have emerged as an effective paradigm: nodes communicate only with their immediate neighbors, without a central server, which improves communication efficiency and algorithmic robustness. However, current decentralized SBO algorithms face challenges, including expensive inner-loop updates and a limited understanding of how network topology, data heterogeneity, and the nested bilevel algorithmic structure influence their behavior. In this paper, we introduce D-SOBA, a single-loop decentralized SBO algorithm, and establish its transient iteration complexity, which, for the first time, clarifies the joint influence of network topology and data heterogeneity on decentralized bilevel algorithms. Compared with existing methods, D-SOBA achieves state-of-the-art asymptotic rate, asymptotic gradient/Hessian complexity, and transient iteration complexity under more relaxed assumptions. Numerical experiments validate our theoretical findings.
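To make the decentralized communication pattern concrete, the following is a minimal sketch of neighbor-only averaging (often called gossip averaging), the basic primitive on which decentralized methods of this kind typically build. It is not the D-SOBA algorithm itself. The ring topology, the uniform 1/3 weights, and all function names are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: neighbor-only (gossip) averaging on a ring.
# The topology, weights, and names below are assumptions, not part of D-SOBA.

def ring_mixing_matrix(n):
    """Build a doubly stochastic mixing matrix for a ring of n nodes.

    Each node puts weight 1/3 on itself and on each of its two neighbors.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def gossip_step(W, values):
    """One communication round: each node takes a weighted average of the
    values held by itself and its neighbors (nonzero entries in its row)."""
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]


def gossip_average(values, num_rounds):
    """Run num_rounds of neighbor-only averaging with no central server."""
    W = ring_mixing_matrix(len(values))
    for _ in range(num_rounds):
        values = gossip_step(W, values)
    return values


# Each node starts with a different local value (a stand-in for heterogeneous data).
local_values = [1.0, 5.0, 3.0, 7.0, 4.0, 10.0]
result = gossip_average(local_values, num_rounds=100)

# All nodes should approach the global average without any server.
target = sum(local_values) / len(local_values)
max_error = max(abs(v - target) for v in result)
```

How quickly `max_error` shrinks depends on the mixing matrix: sparser topologies mix more slowly. This is one simple way to see why network topology enters the complexity of decentralized algorithms.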