In this paper, we introduce a family of sequential decision-making problems, collectively termed the Keychain Problem, that involve exploring a set of actions to maximize expected payoff when only a subset of the actions is available in each stage. In an instance of the Keychain Problem, a locksmith faces a sequence of decisions, each of which involves selecting one key from a keychain (a subset of keys) to attempt to open a lock. Given a Bayesian prior on the effectiveness of keys, the locksmith's goal is to minimize the opportunity cost, defined as the expected number of rounds in which the keychain contains a correct key but the selected key is incorrect. We study the computation of the Bayes-optimal solution for Keychain Problems. Using polynomial-time reductions, we establish formal connections between natural variants of the Keychain Problem and well-studied problems in algorithmic economics on bipartite graphs. When the keychain order is known to the locksmith, we show that the problem reduces to Maximum Weight Bipartite Matching (MWBM). More general is the setting in which the keychain order is sampled from a prior distribution (possibly correlated with the correct key). Here the Keychain Problem reduces to a novel generalization of MWBM, which we call Maximum Weight Laminar Matching, and which in turn reduces to combinatorial auctions under XOS valuation functions. Finally, we show that when the locksmith can choose the keychain order, a classic NP-hard combinatorial problem, again on bipartite graphs, reduces to the Keychain Problem. Besides implying algorithmic results and deepening our structural understanding of the Keychain Problem, these reductions also find applications beyond it, for example to the Philosopher Inequality for online bipartite matching.