We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. The key idea is to connect the expected downstream decision loss to the directional derivative of a particular plug-in objective, and then to approximate this derivative using zeroth-order gradient techniques. Unlike the original decision loss, which is typically piecewise constant and discontinuous, our PG losses are Lipschitz continuous, difference-of-concave functions that can be optimized with off-the-shelf gradient-based methods. Most importantly, unlike existing surrogate losses, the approximation error of our PG losses vanishes as the number of samples grows. Hence, optimizing our surrogate loss yields a best-in-class policy asymptotically, even in misspecified settings. This is the first such result for misspecified settings, and we provide numerical evidence confirming that our PG losses substantively outperform existing proposals when the underlying model is misspecified.
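To make the zeroth-order idea concrete, the following is a minimal sketch (not the paper's actual PG loss): it estimates the directional derivative of a toy plug-in objective by forward finite differences. All names (`plug_in_objective`, the discrete action set, the step size `h`) are hypothetical illustrations, assuming a plug-in objective given by the optimal value of a linear problem over a finite action set.

```python
import numpy as np

def directional_derivative_fd(f, theta, direction, h=1e-4):
    """Forward-difference (zeroth-order) estimate of the directional
    derivative: D_v f(theta) ~ (f(theta + h*v) - f(theta)) / h.
    Uses only function evaluations, no analytic gradient."""
    return (f(theta + h * direction) - f(theta)) / h

# Hypothetical plug-in objective: the optimal value of a linear
# optimization over a finite action set with cost vector theta.
actions = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [0.5, 0.5]])

def plug_in_objective(theta):
    # max over actions of theta . a  (piecewise linear in theta)
    return np.max(actions @ theta)

theta = np.array([0.2, 0.7])   # current prediction parameters
v = np.array([0.0, 1.0])       # perturbation direction
est = directional_derivative_fd(plug_in_objective, theta, v)
```

Because the plug-in objective here is piecewise linear, the finite-difference quotient recovers the exact directional derivative whenever the step stays within one linear piece; this smoothing-by-perturbation effect is what makes such losses amenable to gradient-based optimization.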