We study variants of the secretary problem in which $N$, the number of candidates, is a random variable, and the decision maker wants to maximize the probability of success (picking the best of the $N$ candidates) using only the relative ranks of the candidates revealed so far. We consider three forms of prior information about $\mathbf p$, the probability distribution of $N$. In the full information setting, we assume that $\mathbf p$ is fully known. In that case, we show that single-threshold strategies achieve a $1/e$-approximation to the maximum probability of success among all possible strategies. In the upper bound setting, we assume that $N\leq \bar{n}$ (or $\mathbb E[N]\leq \bar{\mu}$), where $\bar{n}$ (or $\bar{\mu}$) is known. In that case, we show that randomizing over single-threshold strategies asymptotically achieves the optimal worst-case probability of success of $\frac{1}{\log(\bar{n})}$ (or $\frac{1}{\log(\bar{\mu})}$). Surprisingly, there is a single-threshold strategy (depending on $\bar{n}$) that succeeds with probability $2/e^2$ for all but an exponentially small fraction of distributions supported on $[\bar{n}]$. In the sampling setting, we assume access to $m$ samples $N^{(1)},\ldots,N^{(m)}\sim_{iid} \mathbf p$. In that case, we show that if $N\leq T$ with probability at least $1-O(\epsilon)$ for some $T\in \mathbb N$, then $m\gtrsim \frac{1}{\epsilon^2}\max(\log(\frac{1}{\epsilon}),\epsilon \log(\frac{\log(T)}{\epsilon}))$ samples are enough to learn a strategy that is at most $\epsilon$-suboptimal. We also provide a lower bound of $\Omega(\frac{1}{\epsilon^2})$, showing that the sampling algorithm is optimal when $\epsilon=O(\frac{1}{\log\log(T)})$.
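To make the full information setting concrete, the following is a minimal sketch, not the paper's algorithm, of how one might evaluate a single-threshold strategy when $\mathbf p$ is known. It assumes the classical form of such a strategy: skip the first $k$ candidates, then accept the first candidate who is better than all candidates seen so far. Conditioned on $N=n$ with $1\leq k<n$, that rule succeeds with probability $\frac{k}{n}\sum_{i=k+1}^{n}\frac{1}{i-1}$. The function names and the dictionary representation of $\mathbf p$ are illustrative assumptions.

```python
from fractions import Fraction


def success_given_n(n, k):
    """Success probability of the threshold-k rule when exactly n candidates arrive.

    The rule skips the first k candidates, then accepts the first candidate
    who is better than everyone seen so far.
    """
    if n <= 0 or k >= n:
        # The rule never accepts anyone, so it cannot succeed.
        return Fraction(0)
    if k == 0:
        # The rule accepts the first candidate, who is the best with probability 1/n.
        return Fraction(1, n)
    # The best candidate sits at position i > k, and the best of the first
    # i - 1 candidates must lie among the first k skipped ones.
    return Fraction(k, n) * sum(Fraction(1, i - 1) for i in range(k + 1, n + 1))


def success_probability(p, k):
    """Average the per-n success probability over p, a dict mapping n -> P(N = n)."""
    return sum(Fraction(prob) * success_given_n(n, k) for n, prob in p.items())


def best_threshold(p):
    """Brute-force search for the threshold k that maximizes the success probability."""
    n_max = max(p)
    return max(range(n_max), key=lambda k: success_probability(p, k))
```

For example, `success_given_n(3, 1)` returns `Fraction(1, 2)`, the familiar value for three candidates. The brute-force search only finds the best strategy within the single-threshold class; it says nothing about the optimum over all strategies.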