This paper investigates a class of games with large strategy spaces, motivated by challenges in AI alignment and language games. We introduce the hidden game problem, in which each player has an unknown subset of strategies that consistently yields higher rewards than the rest. The central question is whether efficient regret minimization algorithms can be designed to discover and exploit such hidden structures, reaching equilibrium in these subgames while maintaining rationality in general. We answer this question affirmatively by developing a composition of regret minimization techniques that achieves optimal external and swap regret bounds. Our approach ensures rapid convergence to correlated equilibria in hidden subgames and leverages the hidden game structure for improved computational efficiency.