A recent line of structured learning methods has advanced the practical state-of-the-art for combinatorial optimization problems with complex, application-specific objectives. These approaches learn policies that couple a statistical model with a tractable surrogate combinatorial optimization oracle, so as to exploit the distribution of problem instances instead of solving each instance independently. A core obstacle is that the empirical risk is then piecewise constant in the model parameters. This hinders gradient-based optimization, and few theoretical guarantees have been established so far. We address this issue by analyzing smoothed (perturbed) policies: adding controlled random perturbations to the direction used by the linear oracle yields a differentiable surrogate risk and improves generalization. Our main contribution is a generalization bound that decomposes the excess risk into perturbation bias, statistical estimation error, and optimization error. The analysis hinges on a new Uniform Weak (UW) property capturing the geometric interaction between the statistical model and the normal fan of the feasible polytope; we show that it holds under mild assumptions, and automatically when a minimal baseline perturbation is present. The framework covers, in particular, contextual stochastic optimization. We illustrate the scope of the results on applications such as stochastic vehicle scheduling, highlighting how smoothing enables both tractable training and controlled generalization.
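The smoothing idea described above can be illustrated with a minimal sketch: a linear oracle over the vertices of a polytope is piecewise constant in the parameter vector, whereas the Monte Carlo average of the oracle's output under Gaussian perturbations varies smoothly with the parameters. The function names (`linear_oracle`, `perturbed_policy`) and the Gaussian perturbation with scale `epsilon` are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def linear_oracle(theta, vertices):
    """Exact oracle: the vertex maximizing <theta, v>.
    As a function of theta, the output is piecewise constant."""
    scores = vertices @ theta
    return vertices[np.argmax(scores)]

def perturbed_policy(theta, vertices, epsilon=0.5, n_samples=2000, rng=None):
    """Smoothed (perturbed) policy: Monte Carlo estimate of
    E[oracle(theta + epsilon * Z)] with Z ~ N(0, I).
    The expectation is differentiable in theta; here we only
    approximate its value by averaging oracle outputs."""
    rng = np.random.default_rng(0) if rng is None else rng
    out = np.zeros(vertices.shape[1])
    for _ in range(n_samples):
        z = rng.standard_normal(theta.shape)
        out += linear_oracle(theta + epsilon * z, vertices)
    return out / n_samples
```

For example, with the unit simplex in three dimensions, the exact oracle jumps abruptly between vertices as `theta` crosses a cone of the normal fan, while the perturbed policy returns a convex combination of vertices that moves continuously with `theta`; the scale `epsilon` trades perturbation bias against smoothness, mirroring the bias term in the excess-risk decomposition.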