IID Prophet Inequality with Random Horizon: Going Beyond Increasing Hazard Rates

Prophet inequalities are a central object of study in optimal stopping theory. In the iid model, a gambler sees values in an online fashion, sampled independently from a given distribution. Upon observing each value, the gambler either accepts it as a reward or irrevocably rejects it and proceeds to observe the next value. The goal of the gambler, who cannot see the future, is to maximise the expected value of the reward while competing against the expectation of a prophet (the offline maximum). In other words, one seeks to maximise the gambler-to-prophet ratio of the expectations. This model has been studied with an infinite, a finite, and an unknown number of values. When the gambler faces a random number of values, the model is said to have a random horizon. We consider the model in which the gambler is given a priori knowledge of the horizon's distribution. Alijani et al. (2020) designed a single-threshold algorithm achieving a ratio of $1/2$ when the random horizon has an increasing hazard rate and is independent of the values. We prove that with a single threshold, a ratio of $1/2$ is actually achievable for several larger classes of horizon distributions, the largest being known as the $\mathcal{G}$ class in reliability theory. Moreover, we extend this result to its dual, the $\overline{\mathcal{G}}$ class (which includes the decreasing hazard rate class), and to low-variance horizons. Finally, we construct the first example of a family of horizons for which multiple thresholds are necessary to achieve a nonzero ratio. We establish that the Secretary Problem optimal stopping rule provides one such algorithm, paving the way towards the study of the model beyond single-threshold algorithms.
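To make the gambler-to-prophet ratio concrete, the following minimal sketch computes it exactly for a single-threshold rule under a random horizon that is independent of the values. Everything specific here is an illustrative assumption rather than the construction used in the paper: the values are taken to be Uniform(0,1), the horizon is passed as a probability mass function over positive integers, the threshold is found by a plain grid search, and the function names (`prophet_value`, `gambler_value`, `best_single_threshold`) are hypothetical.

```python
from typing import Dict, Tuple


def prophet_value(horizon_pmf: Dict[int, float]) -> float:
    """Expected offline maximum for Uniform(0,1) values (an assumption).

    For a fixed horizon n, E[max of n uniforms] = n / (n + 1); we average
    this over the horizon distribution.
    """
    return sum(p * n / (n + 1) for n, p in horizon_pmf.items())


def gambler_value(threshold: float, horizon_pmf: Dict[int, float]) -> float:
    """Expected reward of accepting the first value >= threshold.

    With the horizon N independent of the values, the gambler stops with
    probability 1 - E[threshold^N], and the accepted value has conditional
    mean (1 + threshold) / 2 under Uniform(0,1).
    """
    accept_prob = 1.0 - sum(p * threshold ** n for n, p in horizon_pmf.items())
    return 0.5 * (1.0 + threshold) * accept_prob


def best_single_threshold(
    horizon_pmf: Dict[int, float], grid_size: int = 1000
) -> Tuple[float, float]:
    """Grid-search the threshold in [0, 1) maximising the gambler-to-prophet ratio."""
    prophet = prophet_value(horizon_pmf)
    best_t, best_ratio = 0.0, 0.0
    for i in range(grid_size):
        t = i / grid_size
        ratio = gambler_value(t, horizon_pmf) / prophet
        if ratio > best_ratio:
            best_t, best_ratio = t, ratio
    return best_t, best_ratio


def truncated_geometric(q: float, max_n: int) -> Dict[int, float]:
    """Geometric horizon on {1, ..., max_n}, renormalised (illustrative choice)."""
    weights = {n: (1.0 - q) * q ** (n - 1) for n in range(1, max_n + 1)}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


if __name__ == "__main__":
    horizon = truncated_geometric(q=0.9, max_n=200)
    t, ratio = best_single_threshold(horizon)
    print(f"best threshold ~ {t:.3f}, gambler-to-prophet ratio ~ {ratio:.3f}")
```

The closed-form expressions hold only for the assumed Uniform(0,1) values; for another value distribution, `prophet_value` and `gambler_value` would need the corresponding distribution function and conditional mean. The truncated geometric horizon is just one convenient example of a horizon whose hazard rate does not decrease.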
