Motivated by applications where impatience is pervasive and evaluation times are uncertain, we study a selection model in which options may expire at an unknown point in time and evaluation times are stochastic. Initially, the decision-maker (DM) has access to $n$ options with known non-negative values. Each option has a random evaluation time and a random expiration time; the distributions of these times are known, and we assume the times to be independent. Whenever the DM is free, the DM can select an available option and collect its value; the selected option then occupies the DM for an unknown amount of time. The objective is to maximize the expected total value obtained from the options selected by the DM. Natural formulations of this problem suffer from the curse of dimensionality. In fact, the problem is NP-hard even in the deterministic case. Hence, we focus on efficiently computable approximation algorithms whose expected reward is provably close to the optimal expected value. To this end, we first provide a compact linear programming (LP) relaxation that gives an upper bound on the expected value obtained by the optimal policy. We then design a polynomial-time algorithm that is nearly a $(1/2)\cdot (1-1/e)$-approximation to the optimal LP value (and hence also to the optimal expected value). We next turn to the case of independent and identically distributed (i.i.d.) evaluation times. In this case, we show that the greedy policy, which always selects the highest-valued available option whenever the DM is free, obtains a $1/2$-approximation to the optimal expected value. Our approaches extend readily: we demonstrate their flexibility by providing approximations for natural extensions of our problem. Finally, we evaluate our LP-based policies and the greedy policy empirically on synthetic and real datasets.
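The greedy policy described above can be illustrated with a small Monte Carlo sketch. This is not the authors' implementation; it is a minimal simulation under stated assumptions: the DM observes which options are still available, an option's value is collected at the moment it is selected, evaluation times are i.i.d., and all times are drawn from user-supplied samplers. The names `simulate_greedy`, `estimate_greedy_value`, `eval_sampler`, and `expiry_samplers` are hypothetical and chosen only for this example.

```python
import random


def simulate_greedy(values, eval_sampler, expiry_samplers, rng):
    """Simulate one sample path of the greedy policy and return the value collected.

    Assumptions (not taken from the paper): the DM sees which options are still
    available, and a value is collected as soon as its option is selected.
    """
    # Draw each option's expiration time independently.
    expiry = [sample(rng) for sample in expiry_samplers]
    # Visit options from highest to lowest value. An option that has expired
    # by the time the DM is free can never become available again, so a single
    # pass in this order matches "pick the best available option when free".
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    now = 0.0
    total = 0.0
    for i in order:
        if expiry[i] > now:  # option i is still available
            total += values[i]
            now += eval_sampler(rng)  # DM is busy for a random evaluation time
    return total


def estimate_greedy_value(values, eval_sampler, expiry_samplers, runs=10_000, seed=0):
    """Average the greedy policy's collected value over independent sample paths."""
    rng = random.Random(seed)
    total = sum(
        simulate_greedy(values, eval_sampler, expiry_samplers, rng)
        for _ in range(runs)
    )
    return total / runs


if __name__ == "__main__":
    # Illustrative instance only: exponential evaluation and expiration times.
    values = [5.0, 3.0, 1.0]
    eval_sampler = lambda rng: rng.expovariate(1.0)
    expiry_samplers = [lambda rng: rng.expovariate(0.5) for _ in values]
    print(estimate_greedy_value(values, eval_sampler, expiry_samplers))
```

The exponential distributions in the usage example are placeholders; any samplers that take a `random.Random` instance and return a non-negative time can be substituted. The sketch only estimates the greedy policy's expected value and does not compute the LP upper bound or the optimal policy.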