Revenue in First- and Second-Price Display Advertising Auctions: Understanding Markets with Learning Agents

The transition of display ad exchanges from second-price auctions (SPA) to first-price auctions (FPA) has raised questions about its impact on revenue. Auction theory predicts revenue equivalence between these two auction formats. However, display ad auctions differ from the standard models of auction theory. First, automated bidding agents cannot easily derive equilibrium strategies in the FPA because information about competitors is not readily available. Second, due to principal-agent problems, bidding agents typically maximize return on investment (ROI), not payoff. The literature on learning agents for real-time bidding is growing because of the practical relevance of this area; most research has found that learning agents do not converge to an equilibrium. Specifically, research on algorithmic collusion in display ad auctions has argued that the FPA can induce symmetric Q-learning agents to collude tacitly, resulting in bids below equilibrium and thus lower revenue than in the SPA. Whether bids are in equilibrium cannot easily be determined from field data, since the underlying values of bidders are unknown. In this paper, we draw on analytical modeling and numerical experiments to explore the convergence behavior of widely used online learning algorithms in both complete and incomplete information models. Contrary to prior results, we find no systematic deviations from equilibrium behavior. We also explore the differences in revenue between the FPA and the SPA, which have not been studied for utility functions relevant to this domain, such as ROI. For such utility functions, learning algorithms also converge to equilibrium, yet revenue equivalence does not hold. This indicates that collusion may not be the explanation for lower revenue with the FPA, and that the change in auction format might have had substantial and non-obvious consequences for ad exchanges and advertisers.
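To make the revenue-equivalence benchmark concrete, the following is a minimal Monte Carlo sketch of the textbook setting only: symmetric, payoff-maximizing bidders with independent values drawn uniformly from [0, 1]. It is not the ROI model or any of the learning algorithms studied in this paper. The function name `simulate_revenue` and all parameters are hypothetical and chosen for illustration. In this benchmark the symmetric FPA equilibrium bid is (n-1)/n times the bidder's value, truthful bidding is dominant in the SPA, and both formats yield expected revenue (n-1)/(n+1).

```python
import random


def simulate_revenue(n_bidders=2, n_auctions=200_000, seed=0):
    """Estimate average seller revenue in the FPA and SPA (illustrative sketch).

    Assumptions (textbook benchmark, not the paper's ROI setting):
    - values are i.i.d. uniform on [0, 1];
    - FPA bidders play the symmetric equilibrium bid (n-1)/n * value;
    - SPA bidders bid their value truthfully.
    """
    rng = random.Random(seed)
    shade = (n_bidders - 1) / n_bidders  # equilibrium bid shading in the FPA
    fpa_total = 0.0
    spa_total = 0.0
    for _ in range(n_auctions):
        values = sorted(rng.random() for _ in range(n_bidders))
        # FPA: the highest-value bidder wins and pays their own shaded bid.
        fpa_total += shade * values[-1]
        # SPA: the highest-value bidder wins and pays the second-highest bid.
        spa_total += values[-2]
    return fpa_total / n_auctions, spa_total / n_auctions


fpa_rev, spa_rev = simulate_revenue()
print(f"FPA revenue: {fpa_rev:.4f}, SPA revenue: {spa_rev:.4f}")
```

With two bidders, both estimates should lie close to the theoretical value of 1/3. The abstract's point is that this equivalence need not survive once bidders maximize ROI rather than payoff.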
