Budget management strategies in repeated auctions have received growing attention in online advertising markets. However, previous work on budget management in online bidding has mainly focused on second-price auctions. The rapid shift from second-price to first-price auctions for online ads in recent years raises the challenging question of how to bid in repeated first-price auctions while controlling budgets. In this work, we study the problem of learning in repeated first-price auctions with budgets. We design a dual-based algorithm that achieves a near-optimal $\widetilde{O}(\sqrt{T})$ regret under full-information feedback, where the maximum competing bid is always revealed after each auction. We further consider the setting with one-sided information feedback, where only the winning bid is revealed after each auction. We show that our modified algorithm still achieves an $\widetilde{O}(\sqrt{T})$ regret under mild assumptions on the bidder's value distribution. Finally, we complement the theoretical results with numerical experiments that confirm the effectiveness of our budget management policy.
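To make the idea of a dual-based budget management policy concrete, the following is a minimal sketch of one common form of dual pacing under full-information feedback. It is not the algorithm from this work, whose details the abstract does not give. The function name `run_dual_pacing`, the step size `1/sqrt(T)`, the empirical-distribution estimate of the maximum competing bid, the restriction of candidate bids to previously observed competing bids, and the rule that ties go to the bidder are all assumptions made for illustration.

```python
import bisect


def run_dual_pacing(values, competing_bids, budget, step_size=None):
    """Simulate a budget-paced bidder over T repeated first-price auctions.

    Hypothetical sketch: values[t] is the bidder's value in round t and
    competing_bids[t] is the maximum competing bid, revealed after the round.
    """
    T = len(values)
    rho = budget / T                      # target spend per round
    eta = step_size if step_size is not None else 1.0 / (T ** 0.5)
    lam = 0.0                             # dual variable: shadow price of budget
    remaining = budget
    history = []                          # sorted competing bids seen so far
    total_reward = 0.0
    total_spend = 0.0

    for v, m in zip(values, competing_bids):
        # Pick the bid maximizing the estimated Lagrangian payoff
        # (v - (1 + lam) * b) * P_hat(win with bid b).
        # Default is bidding 0, which has score 0.
        best_bid, best_score = 0.0, 0.0
        n = len(history)
        for b in history:
            if b > remaining:             # never bid more than the budget left
                break
            win_prob = bisect.bisect_right(history, b) / n
            score = (v - (1.0 + lam) * b) * win_prob
            if score > best_score:
                best_bid, best_score = b, score

        won = best_bid >= m               # assumption: ties go to the bidder
        cost = best_bid if won else 0.0   # first-price: the winner pays its bid
        if won:
            total_reward += v - best_bid
        remaining -= cost
        total_spend += cost

        # Projected dual step: raise lam after overspending, lower it otherwise.
        lam = max(0.0, lam - eta * (rho - cost))
        bisect.insort(history, m)         # full-information feedback

    return total_reward, total_spend
```

The dual variable acts as a per-unit price on spending: when a round costs more than the target rate `rho`, `lam` increases and later bids are shaded more aggressively. Under one-sided feedback, the competing bid would not be observed in every round, so the history update in the last line of the loop would need to change.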