We study Stackelberg (leader--follower) tuning of network parameters (tolls, capacities, incentives) in combinatorial congestion games, where selfish users choose discrete routes (or other combinatorial strategies) and settle at a congestion equilibrium. The leader minimizes a system-level objective (e.g., total travel time) evaluated at equilibrium, but this objective is typically nonsmooth because the set of used strategies can change abruptly. We propose ZO-Stackelberg, which couples a projection-free Frank--Wolfe equilibrium solver with a zeroth-order outer update, avoiding differentiation through equilibria. We prove convergence to generalized Goldstein stationary points of the true equilibrium objective, with explicit dependence on the equilibrium approximation error, and analyze subsampled oracles: if an exact minimizer is sampled with probability $\kappa_m$, then the Frank--Wolfe error decays as $\mathcal{O}(1/(\kappa_m T))$. We also propose stratified sampling as a practical way to avoid a vanishing $\kappa_m$ when the strategies that matter most for the Wardrop equilibrium concentrate in a few dominant combinatorial classes (e.g., short paths). Experiments on real-world networks demonstrate that our method achieves orders-of-magnitude speedups over a differentiation-based baseline while converging to follower equilibria.
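To make the two-level structure concrete, the following is a minimal sketch of a zeroth-order outer loop wrapped around a Frank--Wolfe equilibrium solver. It is not the paper's implementation: the two-link Pigou-style instance, the function names (`follower_equilibrium`, `leader_objective`, `zo_stackelberg`), the two-point sphere-sampling gradient estimator, the box constraint on tolls, and all step sizes are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy instance: unit demand over two parallel links with
# latency_i(x_i) = a[i] + b[i] * x_i, plus a leader toll theta[i] on each link.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

def follower_equilibrium(theta, iters=400):
    """Frank-Wolfe on the Beckmann potential; returns approximate link flows."""
    x = np.full(a.size, 1.0 / a.size)           # start from a uniform split
    for t in range(iters):
        cost = a + b * x + theta                # gradient of the potential at x
        vertex = np.zeros_like(x)
        vertex[np.argmin(cost)] = 1.0           # LMO: all demand on the cheapest link
        x += 2.0 / (t + 2.0) * (vertex - x)     # convex combination, no projection
    return x

def leader_objective(theta):
    """Total travel time (tolls excluded) at the approximate equilibrium."""
    x = follower_equilibrium(theta)
    return float(np.sum(x * (a + b * x)))

def zo_stackelberg(theta0, steps=300, lr=0.05, delta=0.1, theta_max=2.0, seed=0):
    """Outer loop: two-point zeroth-order estimate, then a clipped descent step."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    d = theta.size
    for _ in range(steps):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)                  # uniform direction on the unit sphere
        diff = leader_objective(theta + delta * u) - leader_objective(theta - delta * u)
        grad_est = d / (2.0 * delta) * diff * u # uses only objective evaluations
        theta = np.clip(theta - lr * grad_est, 0.0, theta_max)  # assumed toll box
    return theta

theta_star = zo_stackelberg([0.0, 0.0])
print("tolls:", theta_star, "total travel time:", leader_objective(theta_star))
```

In this toy instance the untolled equilibrium sends all traffic over the second link (total travel time 1), while the analytically optimal toll gap of 0.5 between the two links splits the flow evenly (total travel time 0.75). The outer loop only ever queries `leader_objective`, so the kink at the point where the second link stops being used never has to be differentiated.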