Bilevel optimization reveals the inner structure of otherwise oblique optimization problems, such as hyperparameter tuning and meta-learning. A common goal in bilevel optimization is to find stationary points of the hyper-objective function. Although this hyper-objective approach is widely used, its theoretical properties have not been thoroughly investigated in cases where the lower-level functions lack strong convexity. In this work, we take a step forward and study the hyper-objective approach without the typical lower-level strong convexity assumption. Our hardness results show that the hyper-objective of general convex lower-level functions can be intractable either to evaluate or to optimize. To tackle this challenge, we introduce the gradient dominant condition, which strictly relaxes the strong convexity assumption by allowing the lower-level solution set to be non-singleton. Under the gradient dominant condition, we propose the Inexact Gradient-Free Method (IGFM), which uses the Switching Gradient Method (SGM) as the zeroth-order oracle, to find an approximate stationary point of the hyper-objective. We also extend our results to nonsmooth lower-level functions under the weak sharp minimum condition.
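To make the pipeline described in the abstract concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. Everything in it is an assumption introduced for illustration: the toy objectives `g_lower` and `f_upper`, the known optimal lower-level value `G_STAR`, the step sizes, and the function names `sgm_value`, `zeroth_order_grad`, and `igfm_sketch`. The lower-level function g(x, y) = 0.5 (y[0] - x)^2 is gradient dominant in y but not strongly convex, so its solution set is a whole line. The inner loop is a simple switching-gradient scheme (step on g while the lower-level gap exceeds a tolerance, otherwise step on f), and the outer loop uses a standard two-point zeroth-order gradient estimator built from those inexact values.

```python
import numpy as np

G_STAR = 0.0  # optimal lower-level value; known in closed form for this toy problem (assumption)

def g_lower(x, y):
    # Toy lower-level objective: flat in y[1], so the solution set is {(x, t) : t real}.
    return 0.5 * (y[0] - x[0]) ** 2

def grad_y_g(x, y):
    return np.array([y[0] - x[0], 0.0])

def f_upper(x, y):
    # Toy upper-level objective; it selects y[1] = 1 among the lower-level solutions.
    return (x[0] - 1.0) ** 2 + y[0] ** 2 + (y[1] - 1.0) ** 2

def grad_y_f(x, y):
    return np.array([2.0 * y[0], 2.0 * (y[1] - 1.0)])

def true_phi(x):
    # Closed-form hyper-objective of the toy problem (y[0] = x, y[1] = 1).
    return (x[0] - 1.0) ** 2 + x[0] ** 2

def sgm_value(x, steps=400, eta=0.25, tol=1e-6):
    """Inexact hyper-objective value from a switching-gradient loop (sketch)."""
    y = np.zeros(2)
    best = np.inf
    for _ in range(steps):
        if g_lower(x, y) - G_STAR > tol:
            # Lower-level gap too large: take a step on g.
            y = y - eta * grad_y_g(x, y)
        else:
            # Nearly lower-level optimal: record f, then take a step on f.
            best = min(best, f_upper(x, y))
            y = y - eta * grad_y_f(x, y)
    return best

def zeroth_order_grad(x, delta, rng):
    """Two-point gradient estimate along a random unit direction."""
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    diff = sgm_value(x + delta * u) - sgm_value(x - delta * u)
    return d / (2.0 * delta) * diff * u

def igfm_sketch(x0, iters=30, step=0.1, delta=0.2, seed=0):
    """Gradient-free descent on the hyper-objective using inexact values only."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x - step * zeroth_order_grad(x, delta, rng)
    return x

x_hat = igfm_sketch([1.0])
print("approximate stationary point:", x_hat)
print("hyper-objective there:", true_phi(x_hat))
```

On this toy problem the closed-form hyper-objective is (x - 1)^2 + x^2, whose minimizer is x = 0.5, so `x_hat` should land close to that point. The inner tolerance `tol` controls how inexact each function value is, which in turn bounds the error of the finite-difference estimate; tightening it costs more inner steps per evaluation.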