Prophet inequalities are a central object of study in optimal stopping theory. A gambler is sent values online, sampled from an instance of independent distributions, in an adversarial, random, or selected order, depending on the model. On observing each value, the gambler either accepts it as a reward or irrevocably rejects it and proceeds to the next value. The goal of the gambler, who cannot see the future, is to maximise the expected reward while competing against the expectation of a prophet (the offline maximum). In other words, one seeks to maximise the gambler-to-prophet ratio of the expectations. The model in which the gambler first selects the arrival order and then observes the values is known as Order Selection. Recently it has been shown that in this model a ratio of $0.7251$ can be attained for any instance. If the gambler instead chooses the arrival order uniformly at random, we obtain the Random Order model. The worst-case ratio over all possible instances has been studied extensively for at least $40$ years. Still, it was not known whether carefully choosing the order benefits the gambler more than simply taking it at random. We prove that, in the Random Order model, no algorithm can achieve a ratio larger than $0.7235$, thus showing for the first time that there is a real benefit in choosing the order.
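To make the gambler-to-prophet ratio concrete, the following minimal sketch estimates it by Monte Carlo simulation in the Random Order model. It is an illustration only: the names `simulate_ratio`, `samplers`, and `threshold` are hypothetical, and the single-threshold rule used here is a simple baseline, not the algorithm behind the $0.7251$ guarantee nor the construction behind the $0.7235$ bound.

```python
import random


def simulate_ratio(samplers, threshold, trials=20000, seed=0):
    """Estimate E[gambler reward] / E[prophet reward] in the Random Order model.

    samplers  -- list of callables; each takes a random.Random and returns one
                 draw from its (independent) distribution. Hypothetical interface.
    threshold -- the gambler accepts the first arriving value >= threshold.
    """
    rng = random.Random(seed)
    gambler_total = 0.0
    prophet_total = 0.0
    for _ in range(trials):
        values = [draw(rng) for draw in samplers]
        # The prophet sees everything and takes the offline maximum.
        prophet_total += max(values)
        # Random Order: the values arrive in a uniformly random permutation.
        rng.shuffle(values)
        # The gambler accepts the first value clearing the threshold and
        # earns nothing if every value is rejected.
        gambler_total += next((v for v in values if v >= threshold), 0.0)
    return gambler_total / prophet_total


if __name__ == "__main__":
    n = 5
    # Assumed toy instance: n i.i.d. Uniform(0, 1) values.
    uniform_samplers = [lambda rng: rng.random()] * n
    # Illustrative threshold: the median of the maximum, t with t**n == 1/2.
    t = 0.5 ** (1.0 / n)
    print(f"estimated ratio: {simulate_ratio(uniform_samplers, t):.3f}")
```

On this toy instance the estimate lands a little above one half, well short of the bounds discussed above; better ratios require rules that adapt the threshold to the arrival position.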