Bilevel optimization programming is used to model complex and conflicting interactions between agents, for example in robust AI or privacy-preserving AI. Integrating bilevel mathematical programming within deep learning is thus an essential objective for the machine learning community. Previously proposed approaches only consider single-level programming. In this paper, we extend existing single-level optimization programming approaches and propose Differentiating through Bilevel Optimization Programming (BiGrad) for end-to-end learning of models that use bilevel programming as a layer. BiGrad has wide applicability and can be used in modern machine learning frameworks; it applies to both continuous and combinatorial bilevel optimization problems. For the combinatorial case, we describe a class of gradient estimators that reduces the computational complexity requirements; for the continuous case, the gradient computation takes advantage of the push-back approach (i.e., the vector-Jacobian product) for an efficient implementation. Experiments show that BiGrad successfully extends existing single-level approaches to bilevel programming.
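To make the continuous case concrete, the following is a minimal sketch of differentiating through a bilevel problem with a vector-Jacobian product. It is not the BiGrad implementation. The toy problem, the matrices `A` and `B`, the target `T`, the weight `LAM`, and all function names are assumptions introduced only for illustration. The lower level is a quadratic with a closed-form solution, and the upper-level gradient is obtained by solving one linear system with the lower-level Hessian instead of forming the Jacobian of the lower-level solution explicitly.

```python
# Hypothetical toy bilevel problem (an illustration, not taken from the paper):
#   lower level: y*(x) = argmin_y 0.5 * y^T A y - y^T B x   (A symmetric positive definite)
#   upper level: f(x, y) = 0.5 * ||y - T||^2 + 0.5 * LAM * ||x||^2

A = [[3.0, 1.0], [1.0, 2.0]]
B = [[1.0, 0.5], [-0.5, 2.0]]
T = [1.0, -1.0]
LAM = 0.1


def solve2(m, b):
    """Solve the 2x2 linear system m z = b by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        (b[0] * m[1][1] - m[0][1] * b[1]) / det,
        (m[0][0] * b[1] - m[1][0] * b[0]) / det,
    ]


def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def transpose(m):
    """Transpose a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def lower_solution(x):
    """Lower-level stationarity A y - B x = 0 gives y*(x) = A^{-1} B x."""
    return solve2(A, matvec(B, x))


def upper_objective(x):
    """Upper-level objective evaluated at the lower-level solution y*(x)."""
    y = lower_solution(x)
    fit = sum((yi - ti) ** 2 for yi, ti in zip(y, T))
    reg = sum(xi ** 2 for xi in x)
    return 0.5 * fit + 0.5 * LAM * reg


def hypergradient(x):
    """Total derivative df/dx computed with a vector-Jacobian product.

    By the implicit function theorem, dy*/dx = A^{-1} B for this problem, so
    (dy*/dx)^T v = B^T A^{-T} v. We solve A^T u = v once and then apply B^T,
    so the Jacobian dy*/dx is never built.
    """
    y = lower_solution(x)
    v = [yi - ti for yi, ti in zip(y, T)]      # partial derivative df/dy
    u = solve2(transpose(A), v)                 # adjoint solve with the lower-level Hessian
    indirect = matvec(transpose(B), u)          # pull the vector back to x
    return [LAM * xi + gi for xi, gi in zip(x, indirect)]  # direct + indirect terms


def finite_difference(x, eps=1e-6):
    """Central finite-difference estimate of df/dx, used only as a check."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((upper_objective(hi) - upper_objective(lo)) / (2 * eps))
    return grad
```

Calling `hypergradient([0.5, -0.3])` and `finite_difference([0.5, -0.3])` should agree up to numerical error. In an automatic differentiation framework, the same pattern would typically be registered as a custom backward rule for the layer that solves the lower-level problem.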