Automated bidding, which optimizes online advertising under constraints such as return-on-investment (ROI) and budget constraints, is widely adopted by advertisers. A key challenge lies in designing algorithms for non-truthful mechanisms with ROI constraints. While prior work has addressed truthful auctions, or non-truthful auctions against weaker benchmarks, this paper provides a significant improvement: we develop online bidding algorithms for repeated first-price auctions with ROI constraints, benchmarking against the optimal randomized strategy in hindsight. In the full feedback setting, where the bidder observes the maximum competing bid, our algorithm achieves a near-optimal $\widetilde{O}(\sqrt{T})$ regret bound; in the bandit feedback setting, where the bidder observes only whether it wins each auction, our algorithm attains an $\widetilde{O}(T^{3/4})$ regret bound.
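To make the two feedback models concrete, the following minimal Python sketch simulates a single round of a first-price auction and checks an ROI constraint over past rounds. It is an illustration only, not the algorithm developed in the paper. The names `Feedback`, `first_price_round`, and `roi_satisfied` are hypothetical, and two details are assumptions not stated above: ties are broken in the bidder's favor, and the ROI constraint takes the form "total value won is at least `target_roi` times total payment".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Feedback:
    """What the bidder learns after one round (hypothetical container)."""
    won: bool
    payment: float
    # Revealed only under full feedback; None under bandit feedback.
    max_competing_bid: Optional[float]


def first_price_round(bid: float, max_competing_bid: float,
                      full_feedback: bool) -> Feedback:
    """Run one first-price auction round from a single bidder's view."""
    # Assumption: ties are broken in favor of the bidder.
    won = bid >= max_competing_bid
    # First-price rule: the winner pays its own bid.
    payment = bid if won else 0.0
    observed = max_competing_bid if full_feedback else None
    return Feedback(won=won, payment=payment, max_competing_bid=observed)


def roi_satisfied(values_won: Sequence[float], payments: Sequence[float],
                  target_roi: float = 1.0) -> bool:
    """Assumed ROI constraint: total value >= target_roi * total payment."""
    return sum(values_won) >= target_roi * sum(payments)
```

Under full feedback the bidder can evaluate in hindsight what any alternative bid would have won and paid, whereas under bandit feedback it learns only the outcome of the bid it actually placed. That information gap is consistent with the weaker $\widetilde{O}(T^{3/4})$ guarantee in the bandit setting.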