In this paper, we study the predict-then-optimize problem, in which the output of a machine learning prediction task serves as the input to a downstream optimization problem, for example as the objective coefficient vector of a linear program. The problem is also known as predictive analytics or contextual linear programming. Existing approaches largely suffer from either (i) optimization intractability (a non-convex objective function) or statistical inefficiency (a suboptimal generalization bound), or (ii) reliance on strong conditions such as the absence of constraints or loss calibration. We develop a new approach to the problem, called \textit{maximum optimality margin}, which designs the machine learning loss function from the optimality condition of the downstream optimization problem. The max-margin formulation enjoys both computational efficiency and good theoretical properties for the learning procedure. More importantly, our approach needs only observations of the optimal solution in the training data, rather than of the objective function, which makes it a new and natural approach to the inverse linear programming problem in both contextual and context-free settings. We analyze the proposed method in both offline and online settings and demonstrate its performance through numerical experiments.
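To make the loss design described above concrete, the following is a minimal sketch of a margin loss derived from a linear programming optimality condition. It rests on assumptions that the abstract does not state: the LP is in standard form (minimize c^T x subject to Ax = b, x >= 0), the observed optimal solution is a nondegenerate basic solution, the optimality condition used is nonnegativity of the reduced costs of the nonbasic variables, and violations are penalized with a hinge at unit margin. The names `solve`, `reduced_costs`, and `margin_loss` are hypothetical and chosen for illustration only; this is not necessarily the exact formulation of the paper.

```python
def solve(M, v):
    """Solve the square system M y = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [[float(x) for x in M[i]] + [float(v[i])] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-12:
            raise ValueError("singular basis matrix")
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][k] * y[k] for k in range(i + 1, n))
        y[i] = s / aug[i][i]
    return y


def reduced_costs(A, c, basis):
    """Return {j: c_j - A_j^T y} for nonbasic j, where A_B^T y = c_B."""
    m, n = len(A), len(c)
    # Row j of A_B^T is column j of A, for each basic index j.
    ABt = [[A[i][j] for i in range(m)] for j in basis]
    y = solve(ABt, [c[j] for j in basis])
    nonbasis = [j for j in range(n) if j not in basis]
    return {j: c[j] - sum(A[i][j] * y[i] for i in range(m)) for j in nonbasis}


def margin_loss(A, c_hat, x_star, margin=1.0, tol=1e-9):
    """Hinge penalty on nonbasic reduced costs that fall below `margin`.

    Assumption: the basis is read off the support of x_star, which is only
    valid when x_star is a nondegenerate basic solution.
    """
    basis = [j for j, xj in enumerate(x_star) if xj > tol]
    if len(basis) != len(A):
        raise ValueError("x_star is not a nondegenerate basic solution")
    r = reduced_costs(A, c_hat, basis)
    return sum(max(0.0, margin - rj) for rj in r.values())


# Toy problem: minimize c^T x over the simplex x1 + x2 + x3 = 1, x >= 0.
A = [[1, 1, 1]]
x_star = [1, 0, 0]  # observed optimal solution; no objective vector is observed
print(margin_loss(A, [0, 2, 3], x_star))    # 0.0: both reduced costs exceed the margin
print(margin_loss(A, [1, 1.5, 0], x_star))  # 2.5: reduced costs 0.5 and -1 violate it
```

Note that the loss depends only on the predicted cost vector `c_hat`, the constraint matrix, and the observed solution, which mirrors the point that only optimal solutions, not objective functions, need to appear in the training data. In a contextual setting, `c_hat` would be the output of a prediction model, and the loss is convex in `c_hat` because each reduced cost is linear in it.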