Bilevel optimization has recently regained interest owing to its applications in emerging machine learning fields such as hyperparameter optimization, meta-learning, and reinforcement learning. Recent results have shown that simple alternating (implicit) gradient-based algorithms can achieve the same convergence rate as single-level gradient descent (GD) for bilevel problems with a strongly convex lower-level objective. However, it remains unclear whether this result can be generalized to bilevel problems beyond this basic setting. In this paper, we propose a Generalized ALternating mEthod for bilevel opTimization (GALET) for bilevel problems with a nonconvex lower-level objective that satisfies the Polyak-{\L}ojasiewicz (PL) condition. We first introduce a stationarity metric for the considered bilevel problems, which generalizes the existing metric. We then establish that GALET finds a point that is $\epsilon$-stationary under this metric within $\tilde{\cal O}(\epsilon^{-1})$ iterations, which matches the iteration complexity of GD for smooth nonconvex problems.
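For concreteness, the problem class described above is commonly written as follows. The symbols $f$, $g$, $x$, $y$, $S(x)$, and $\mu$ are notation assumed here for illustration; they do not appear in the abstract, and the paper's own formulation may differ in detail.

$$\min_{x,\,y}\; f(x,y) \quad \text{s.t.} \quad y \in S(x) := \arg\min_{y'}\, g(x,y'),$$

where $f$ is the upper-level objective and $g$ is the lower-level objective. In its standard form, the PL condition on the lower level requires that, for some $\mu > 0$ and every fixed $x$,

$$\tfrac{1}{2}\,\|\nabla_y g(x,y)\|^2 \;\ge\; \mu\,\Big(g(x,y) - \min_{y'} g(x,y')\Big) \quad \text{for all } y.$$

Every strongly convex function satisfies this inequality, but the converse fails: a PL function may be nonconvex and may have many minimizers, so $S(x)$ need not be a single point. This is one plausible reading of why a more general stationarity metric is needed in this setting.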