We study a sequential mechanism design problem in which a principal seeks to elicit truthful reports from multiple rational agents while starting with no prior knowledge of the agents' beliefs. We introduce the Distributionally Robust Adaptive Mechanism (DRAM), a general framework that combines insights from mechanism design and online learning to jointly address truthfulness and cost-optimality. Throughout the sequential game, the mechanism estimates the agents' beliefs and iteratively updates a distributionally robust linear program with shrinking ambiguity sets, reducing payments while preserving truthfulness. Our mechanism guarantees truthful reporting with high probability while achieving $\tilde{O}(\sqrt{T})$ cumulative regret, and we establish a matching lower bound showing that no feasible adaptive mechanism can do asymptotically better. The framework generalizes to plug-in estimators, supporting structured priors and delayed feedback. To our knowledge, this is the first adaptive mechanism in general settings that maintains truthfulness and achieves optimal regret when incentive constraints are unknown and must be learned.
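As a rough illustration of how shrinking ambiguity sets can lead to a $\sqrt{T}$-type rate (a heuristic sketch under stated assumptions, not the paper's proof): suppose, hypothetically, that the ambiguity set at round $t$ has radius $\epsilon_t = c\sqrt{\log(T/\delta)/t}$, as a standard concentration bound would suggest after $t$ observations, and that the extra payment needed to keep truthfulness robust at round $t$ is at most $L\,\epsilon_t$. The constants $c$, $L$, and the failure probability $\delta$ are assumptions introduced here for illustration only. Summing over rounds and using $\sum_{t=1}^{T} t^{-1/2} \le 1 + \int_{1}^{T} t^{-1/2}\,dt \le 2\sqrt{T}$ gives

$$
\sum_{t=1}^{T} L\,\epsilon_t \;=\; L\,c\,\sqrt{\log(T/\delta)}\,\sum_{t=1}^{T}\frac{1}{\sqrt{t}} \;\le\; 2\,L\,c\,\sqrt{T\log(T/\delta)} \;=\; \tilde{O}(\sqrt{T}).
$$

Under these assumptions, the logarithmic factor absorbed by the $\tilde{O}$ notation comes from requiring the confidence sets to hold simultaneously over all $T$ rounds.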