We show that in discretised first-price auctions with complete information, if the buyers learn to bid using online gradient ascent, then the time-averaged outcome is (almost) the efficient outcome of the second-price auction. Our proof rests on two innovations in the analysis of online gradient ascent in normal-form games, which may be useful in a wider range of applications. First, we develop a potential-function-based argument that lets us deduce that certain strategies are not played in time-average. We provide sufficient conditions which ensure this argument can be applied iteratively, resulting in a procedure reminiscent of iterated elimination of dominated strategies. Second, we develop a novel class of cubic "candidate potential functions", classifying a family of quadratic strategy modifications on the probability simplex against which online gradient ascent incurs no regret.
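To make the setting concrete, the following is a minimal sketch of projected online gradient ascent in a small discretised first-price auction. It is an illustration under stated assumptions, not the paper's construction: two buyers, integer valuations, an integer bid grid, uniform tie-breaking, simultaneous updates, and a constant step size `eta` are all choices made here, and the names `project_to_simplex`, `payoff`, `payoff_gradient` and `run_oga` are hypothetical.

```python
def project_to_simplex(y):
    """Euclidean projection of y onto the probability simplex (sort-based)."""
    u = sorted(y, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:  # holds for a prefix of k; the last such k sets theta
            theta = t
    return [max(y_j - theta, 0.0) for y_j in y]


def payoff(value, own_bid, other_bid):
    """First-price payoff with uniform tie-breaking (an assumption)."""
    if own_bid > other_bid:
        return float(value - own_bid)
    if own_bid == other_bid:
        return 0.5 * (value - own_bid)
    return 0.0


def payoff_gradient(value, other_strategy, bids):
    """Expected payoff of each pure bid against the opponent's mixed strategy.

    Expected utility is linear in a buyer's own mixed strategy, so this
    vector is exactly the gradient with respect to that strategy.
    """
    return [
        sum(q * payoff(value, b, c) for q, c in zip(other_strategy, bids))
        for b in bids
    ]


def run_oga(values=(5, 3), num_bids=6, steps=2000, eta=0.05):
    """Run simultaneous projected gradient ascent for two buyers.

    Returns the bid grid and each buyer's time-averaged mixed strategy.
    The defaults (valuations, grid size, step size) are illustrative only.
    """
    bids = list(range(num_bids))
    x = [[1.0 / num_bids] * num_bids for _ in values]
    avg = [[0.0] * num_bids for _ in values]
    for _ in range(steps):
        # Gradients are computed before either buyer moves.
        grads = [payoff_gradient(values[i], x[1 - i], bids) for i in range(2)]
        for i in range(2):
            avg[i] = [a + p / steps for a, p in zip(avg[i], x[i])]
            stepped = [p + eta * g for p, g in zip(x[i], grads[i])]
            x[i] = project_to_simplex(stepped)
    return bids, avg


if __name__ == "__main__":
    bids, avg = run_oga()
    for i, strategy in enumerate(avg):
        print(f"buyer {i}:", [round(p, 3) for p in strategy])
```

Running the script prints each buyer's time-averaged bid distribution, which can be compared by hand with the second-price outcome for the chosen valuations. Whether this toy instance and step size fall within the conditions of the result is not checked here.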