We study a sequential mechanism design problem in which a principal seeks to elicit truthful reports from multiple rational agents while starting with no prior knowledge of the agents' beliefs. We introduce the Distributionally Robust Adaptive Mechanism (DRAM), a general framework that combines insights from mechanism design and online learning to jointly address truthfulness and cost-optimality. Throughout the sequential game, the mechanism estimates the agents' beliefs and iteratively updates a distributionally robust linear program with shrinking ambiguity sets, reducing payments while preserving truthfulness. Our mechanism guarantees truthful reporting with high probability while achieving $\tilde{O}(\sqrt{T})$ cumulative regret over $T$ rounds, and we establish a matching lower bound showing that no feasible adaptive mechanism can do asymptotically better. The framework generalizes to plug-in estimators, supporting structured priors and delayed feedback. To our knowledge, this is the first adaptive mechanism under general settings that maintains truthfulness and achieves optimal regret when the incentive constraints are unknown and must be learned.
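To give some intuition for how shrinking ambiguity sets can be compatible with $\tilde{O}(\sqrt{T})$ regret, the following is a minimal sketch and not the mechanism itself. It assumes, purely for illustration, that the ambiguity set at round $t$ is a ball around the empirical belief estimate with a Hoeffding-style radius of order $\sqrt{\log(1/\delta)/t}$, and that the extra payment in each round scales with that radius. The names `ambiguity_radius` and `cumulative_excess` and the parameters `delta` and `num_outcomes` are hypothetical and do not come from the source.

```python
import math


def ambiguity_radius(t, delta=0.05, num_outcomes=2):
    """Hypothetical Hoeffding-style radius of the ambiguity set after t rounds.

    Assumption: the radius shrinks like sqrt(log(2k / delta) / (2t)), where k is
    the number of outcomes. The source does not specify this form.
    """
    return math.sqrt(math.log(2 * num_outcomes / delta) / (2 * t))


def cumulative_excess(T, delta=0.05, num_outcomes=2):
    """Sum of per-round radii, a stand-in for cumulative overpayment.

    Assumption: the per-round excess payment is proportional to the radius.
    """
    return sum(ambiguity_radius(t, delta, num_outcomes) for t in range(1, T + 1))


if __name__ == "__main__":
    for T in (100, 1_000, 10_000):
        total = cumulative_excess(T)
        # The ratio to sqrt(T) stays bounded, since sum_{t<=T} 1/sqrt(t) <= 2*sqrt(T).
        print(T, round(total, 2), round(total / math.sqrt(T), 3))
```

Under these assumptions the cumulative excess grows no faster than a constant times $\sqrt{T}$; choosing `delta` on the order of $1/T$ would introduce the logarithmic factor hidden by the $\tilde{O}$ notation.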