Differentiable Bilevel Programming for Stackelberg Congestion Games

In a Stackelberg congestion game (SCG), a leader aims to maximize their own gain by anticipating and manipulating the equilibrium state at which the followers settle by playing a congestion game. Often formulated as bilevel programs, large-scale SCGs are well known for their intractability and complexity. Here, we attempt to tackle this computational challenge by marrying traditional methodologies with the latest differentiable programming techniques in machine learning. The core idea centers on replacing the lower-level equilibrium problem with a smooth evolution trajectory defined by the imitative logit dynamic (ILD), which we prove converges to the equilibrium of the congestion game under mild conditions. Building upon this theoretical foundation, we propose two new local search algorithms for SCGs. The first is a gradient descent algorithm that obtains the derivatives by unrolling ILD via differentiable programming. Thanks to the smoothness of ILD, the algorithm promises both efficiency and scalability. The second algorithm adds a heuristic twist by cutting short the followers' evolution trajectory. Behaviorally, this means that, instead of anticipating the followers' best response at equilibrium, the leader seeks to approximate that response by only looking ahead a limited number of steps. Our numerical experiments are carried out over various instances of classic SCG applications, ranging from toy benchmarks to large-scale real-world examples. The results show the proposed algorithms are reliable and scalable local solvers that deliver high-quality solutions with greater regularity and significantly less computational effort compared to the many incumbents included in our study.
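The unrolling idea behind the first algorithm can be illustrated with a minimal JAX sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-route Pigou-style network, the toll as the leader's decision, total travel time as the leader's objective, the parameter values, and the helper names (`travel_times`, `ild_step`, `leader_objective`, `solve`). The ILD update is written in one common discrete-time form, p_i ← p_i exp(-r c_i(p)) / Σ_j p_j exp(-r c_j(p)), which may differ from the exact formulation used by the authors.

```python
import jax
import jax.numpy as jnp

# Hypothetical toy instance (not from the paper): two parallel routes, unit demand.
# Route 0 has travel time equal to its flow share; route 1 has constant travel time 1.
SLOPE = jnp.array([1.0, 0.0])
FREE_FLOW = jnp.array([0.0, 1.0])
TOLLED = jnp.array([1.0, 0.0])  # the leader tolls route 0 only


def travel_times(p):
    """Route travel times as a linear function of the route shares p."""
    return SLOPE * p + FREE_FLOW


def ild_step(p, toll, r):
    """One imitative-logit update: reweight shares by exp(-r * cost), then renormalize."""
    cost = travel_times(p) + toll * TOLLED
    w = p * jnp.exp(-r * cost)
    return w / jnp.sum(w)


def leader_objective(toll, p0, r=2.0, num_steps=50):
    """Total travel time after unrolling num_steps ILD updates from p0."""
    def body(p, _):
        return ild_step(p, toll, r), None

    p_final, _ = jax.lax.scan(body, p0, None, length=num_steps)
    return jnp.dot(p_final, travel_times(p_final))


# Reverse-mode derivative of the unrolled trajectory with respect to the toll.
grad_fn = jax.jit(jax.grad(leader_objective))


def solve(toll0=0.1, lr=0.1, iters=200):
    """Plain gradient descent on the toll, differentiating through the unrolled ILD."""
    p0 = jnp.array([0.5, 0.5])
    toll = toll0
    for _ in range(iters):
        toll = toll - lr * grad_fn(toll, p0)
    return float(toll)
```

In this toy instance the marginal-cost toll is 0.5, so `solve()` should settle near that value. Passing a small `num_steps` to `leader_objective` gives a rough feel for the limited look-ahead used by the second algorithm, although the sketch restarts from a fixed `p0` on every call and is not meant to reproduce the authors' procedure.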
