Iterative combinatorial auctions are widely used in high-stakes settings such as spectrum auctions. Such auctions can be hard to understand analytically, making it difficult for bidders to determine how to behave and for designers to optimize auction rules to ensure desirable outcomes such as high revenue or welfare. In this paper, we investigate whether multi-agent reinforcement learning (MARL) algorithms can be used to understand iterative combinatorial auctions, given that these algorithms have recently shown empirical success in several other domains. We find that MARL can indeed benefit auction analysis, but that deploying it effectively is nontrivial. We begin by describing modelling decisions that keep the resulting game tractable without sacrificing important features such as imperfect information or asymmetry between bidders. We also discuss how to navigate pitfalls of various MARL algorithms, how to overcome challenges in verifying convergence, and how to generate and interpret multiple equilibria. We illustrate the promise of our resulting approach by using it to evaluate a specific rule change to a clock auction, finding substantially different auction outcomes due to complex changes in bidders' behavior.