Stochastic bilevel optimization usually involves minimizing an upper-level (UL) function that depends on the arg-min of a strongly convex lower-level (LL) function. Several algorithms use Neumann series to approximate the matrix inverses that arise when estimating the implicit gradient of the UL function (the hypergradient). The state-of-the-art StOchastic Bilevel Algorithm (SOBA) [16] instead uses stochastic gradient descent steps to solve the linear system associated with the explicit matrix inversion. This modification enables SOBA to match the sample-complexity lower bound of its single-level counterpart in non-convex settings. Unfortunately, the current analysis of SOBA relies on higher-order smoothness assumptions on the UL and LL functions to achieve optimality. In this paper, we introduce a novel fully single-loop and Hessian-inversion-free algorithmic framework for stochastic bilevel optimization and present a tighter analysis under standard smoothness assumptions (first-order Lipschitzness of the UL function and second-order Lipschitzness of the LL function). Furthermore, we show that, with a slight modification, our algorithm can handle a more general multi-objective robust bilevel optimization problem. For this case, we obtain state-of-the-art oracle complexity results, demonstrating the generality of both the proposed algorithmic and analytic frameworks. Numerical experiments demonstrate the performance gain of the proposed algorithms over existing ones.
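To make the single-loop, Hessian-inversion-free idea concrete, the following is a minimal sketch on a toy quadratic problem. It is not the algorithm proposed in this paper, nor SOBA as specified in [16]. It only illustrates the mechanism described above: an auxiliary variable `v` tracks the solution of the linear system ∇²_yy g · v = −∇_y f through gradient-type steps, so the LL Hessian is never inverted inside the loop. The problem data (`A`, `B`, `c`, `reg`), the step sizes (`rho`, `gamma`), and all function names are assumptions chosen for illustration, and the additive Gaussian noise is a stand-in for stochastic oracles.

```python
import numpy as np

# Hypothetical toy problem (an assumption for illustration, not from the paper):
#   lower level: g(x, y) = 0.5 * y^T A y - y^T B x      (strongly convex in y)
#   upper level: f(x, y) = 0.5 * ||y - c||^2 + 0.5 * reg * ||x||^2
A = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.5, 0.2],
              [0.0, 0.2, 1.0]])   # symmetric positive definite LL Hessian
B = np.array([[1.0, 0.0],
              [0.5, -1.0],
              [0.0, 0.5]])
c = np.array([1.0, -1.0, 0.5])
reg = 0.5


def single_loop_bilevel(n_iters=5000, rho=0.1, gamma=0.01, noise_std=0.0, seed=0):
    """Update x, y and v simultaneously; no inner loop, no Hessian inverse."""
    rng = np.random.default_rng(seed)
    x, y, v = np.zeros(2), np.zeros(3), np.zeros(3)
    for _ in range(n_iters):
        # Noisy first- and second-order oracle outputs at the current iterate.
        d_y = A @ y - B @ x + noise_std * rng.standard_normal(3)      # grad_y g
        d_v = A @ v + (y - c) + noise_std * rng.standard_normal(3)    # residual of A v = -grad_y f
        d_x = reg * x - B.T @ v + noise_std * rng.standard_normal(2)  # grad_x f + (grad^2_xy g) v
        y = y - rho * d_y     # LL variable
        v = v - rho * d_v     # linear-system variable
        x = x - gamma * d_x   # UL variable, on a slower time scale
    return x, y, v


def closed_form_solution():
    """Reference minimizer of Phi(x) = f(x, y*(x)); uses linear solves for checking only."""
    M = np.linalg.solve(A, B)   # y*(x) = M x
    return np.linalg.solve(M.T @ M + reg * np.eye(2), M.T @ c)


x_hat, y_hat, v_hat = single_loop_bilevel()
x_star = closed_form_solution()
print("distance to reference minimizer:", np.linalg.norm(x_hat - x_star))
```

In the noiseless case the three updates share a fixed point at which `y` equals the LL solution, `v` equals −A⁻¹∇_y f, and the UL direction `d_x` equals the exact hypergradient. Taking `gamma` smaller than `rho` lets `y` and `v` track their targets while `x` moves slowly. Setting `noise_std` to a positive value emulates stochastic oracles, in which case the iterates fluctuate around the reference point rather than converging exactly.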