Non-Monetary Mechanism Design without Distributional Information: Using Scarce Audits Wisely

We study a repeated resource allocation problem with strategic agents where monetary transfers are disallowed and the central planner has no prior information on agents' utility distributions. In light of Arrow's impossibility theorem, acquiring information about agent preferences through some form of feedback is necessary. We assume that the central planner can request powerful but expensive audits on the winner in any round, revealing the true utility of the winner in that round. We design a mechanism achieving $T$-independent $O(K^2)$ regret in social welfare while requesting $O(K^3 \log T)$ audits in expectation, where $K$ is the number of agents and $T$ is the number of rounds. We also show an $\Omega(K)$ lower bound on the regret and an $\Omega(1)$ lower bound on the number of audits required by any mechanism that attains low regret. Algorithmically, we show that incentive-compatibility can be mostly enforced with an accurate estimation of the winning probability of each agent under truthful reporting. To do so, we impose future punishments and introduce a *flagging* component, allowing agents to flag any biased estimate (we show that doing so aligns with individual incentives). On the technical side, without monetary transfers and distributional information, the central planner cannot ensure that truthful reporting is exactly an equilibrium. Instead, we characterize the equilibrium via a reduction to a simpler *auxiliary game*, in which agents cannot strategize until late in the $T$ rounds of the allocation problem. The tools developed therein may be of independent interest for other mechanism design problems in which the revelation principle cannot be readily applied.
