Modern open and softwarized systems, such as O-RAN telecom networks and cloud computing platforms, host independently developed applications with distinct, and potentially conflicting, objectives. Coordinating the behavior of such applications to ensure stable system operation poses significant challenges, especially when each application's utility is accessible only via costly, black-box evaluations. In this paper, we consider a centralized optimization framework in which a system controller suggests joint configurations to multiple strategic players, representing different applications, with the goal of aligning their incentives toward a stable outcome. This interaction is modeled as a learned optimization problem with an equilibrium constraint, in which the central optimizer learns the utility functions through sequential, multi-fidelity evaluations with the goal of identifying a pure Nash equilibrium (PNE). To address this challenge, we propose MF-UCB-PNE, a novel multi-fidelity Bayesian optimization strategy that leverages a budget-constrained sampling process to approximate PNE solutions. MF-UCB-PNE systematically balances exploration across low-cost approximations with high-fidelity exploitation steps, enabling efficient convergence to incentive-compatible configurations. We provide theoretical and empirical insights into the trade-offs between query cost and equilibrium accuracy, demonstrating the effectiveness of MF-UCB-PNE in identifying high-quality equilibrium solutions under limited cost budgets.
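To make the target of the optimization concrete, the sketch below illustrates the standard definition of a PNE on a small finite game: a joint configuration is a PNE when no player can gain by deviating unilaterally. This is not MF-UCB-PNE itself. It assumes the utilities are known and cheap to evaluate, whereas the setting above only gives access to costly, black-box, multi-fidelity evaluations. The function names (`pne_gap`, `find_pne`) and the two-player utility tables are hypothetical and chosen only for illustration.

```python
from itertools import product


def pne_gap(utilities, action_sets, joint_action):
    """Largest gain any single player can obtain by deviating unilaterally.

    utilities[i] maps a joint action (a tuple) to player i's utility.
    A gap of zero means joint_action is a pure Nash equilibrium.
    """
    gap = 0.0
    for i, utility in enumerate(utilities):
        current = utility(joint_action)
        for action in action_sets[i]:
            # Player i switches to `action`; all other players stay put.
            deviation = joint_action[:i] + (action,) + joint_action[i + 1:]
            gap = max(gap, utility(deviation) - current)
    return gap


def find_pne(utilities, action_sets, tol=0.0):
    """Enumerate all joint actions whose deviation gap is at most tol."""
    return [
        joint
        for joint in product(*action_sets)
        if pne_gap(utilities, action_sets, joint) <= tol
    ]


# Hypothetical two-player game (prisoner's dilemma payoffs, assumed for
# illustration): action 0 = cooperate, action 1 = defect.
U1 = {(0, 0): 3.0, (0, 1): 0.0, (1, 0): 5.0, (1, 1): 1.0}
U2 = {(0, 0): 3.0, (0, 1): 5.0, (1, 0): 0.0, (1, 1): 1.0}
utilities = [lambda a: U1[a], lambda a: U2[a]]
action_sets = [(0, 1), (0, 1)]

equilibria = find_pne(utilities, action_sets)
print(equilibria)  # [(1, 1)]
```

The deviation gap computed by `pne_gap` is one natural way to quantify equilibrium accuracy: a positive `tol` turns the exact check into an approximate one. Exhaustive enumeration is only feasible here because the game is tiny and the utilities are free to query.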