IID Prophet Inequality with Random Horizon: Going Beyond Increasing Hazard Rates

Prophet inequalities are a central object of study in optimal stopping theory. In the iid model, a gambler sees values in an online fashion, sampled independently from a given distribution. Upon observing each value, the gambler either accepts it as a reward or irrevocably rejects it and proceeds to observe the next value. The goal of the gambler, who cannot see the future, is to maximise the expected value of the reward while competing against the expectation of a prophet (the offline maximum). In other words, one seeks to maximise the gambler-to-prophet ratio of the expectations. This model has been studied with an infinite, finite, and unknown number of values. When the gambler faces a random number of values, the model is said to have a random horizon. We consider the model in which the gambler is given a priori knowledge of the horizon's distribution. Alijani et al. (2020) designed a single-threshold algorithm achieving a ratio of $1/2$ when the random horizon has an increasing hazard rate and is independent of the values. We prove that with a single threshold, a ratio of $1/2$ is actually achievable for several larger classes of horizon distributions, with the largest being known as the $\mathcal{G}$ class in reliability theory. Moreover, we extend this result to its dual, the $\overline{\mathcal{G}}$ class (which includes the decreasing hazard rate class), and to low-variance horizons. Finally, we construct the first example of a family of horizons for which multiple thresholds are necessary to achieve a nonzero ratio. We establish that the Secretary Problem optimal stopping rule provides one such algorithm, paving the way towards the study of the model beyond single-threshold algorithms.
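To make the setting concrete, the following is a minimal Monte Carlo sketch of a single-threshold rule facing a random horizon. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the values are Exp(1), the horizon is geometric with parameter `p = 0.2` (a constant-hazard-rate case), the threshold is set to half of the estimated prophet value, and the function names `sample_geometric` and `simulate` are hypothetical.

```python
import random


def sample_geometric(rng, p):
    """Draw a horizon H >= 1 with P(H = n) = p * (1 - p)**(n - 1)."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n


def simulate(threshold, p=0.2, trials=20000, seed=0):
    """Estimate (gambler_mean, prophet_mean) for one fixed threshold.

    Assumed setup (illustrative only): Exp(1) values, geometric horizon
    drawn independently of the values.
    """
    rng = random.Random(seed)
    gambler_total = 0.0
    prophet_total = 0.0
    for _ in range(trials):
        horizon = sample_geometric(rng, p)
        values = [rng.expovariate(1.0) for _ in range(horizon)]
        # Prophet: sees the whole sequence and takes the offline maximum.
        prophet_total += max(values)
        # Gambler: accepts the first value at or above the threshold and
        # receives nothing if the horizon ends before that happens.
        gambler_total += next((v for v in values if v >= threshold), 0.0)
    return gambler_total / trials, prophet_total / trials


if __name__ == "__main__":
    # First pass: estimate the prophet's expectation (threshold is irrelevant to it).
    _, prophet = simulate(threshold=0.0)
    # Second pass: use half of that estimate as an illustrative threshold.
    gambler, prophet = simulate(threshold=prophet / 2)
    print(f"gambler = {gambler:.3f}, prophet = {prophet:.3f}, "
          f"ratio = {gambler / prophet:.3f}")
```

Because both passes use the same seed, they see the same horizons and values, so the two prophet estimates coincide. The printed ratio is only an empirical estimate for this one assumed instance; it does not stand in for the worst-case guarantees discussed in the abstract.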
