MenuNet: A Strategy-Proof Mechanism for Matching Markets

Strategy-proofness is a fundamental desideratum in mechanism design: it makes truthful reporting a dominant strategy and thereby supports robust participation. Stability is another central requirement in matching markets and is widely adopted in applications such as school choice and labor market clearing. In practice, however, these markets are invariably governed by complex distributional constraints, ranging from diversity quotas and regional balance to global capacity slacks, under which stable matchings often fail to exist. This raises a fundamental question: how should unavoidable instability be distributed across agents while preserving strategy-proofness? To address this question, we propose \texttt{MenuNet}, a strategy-proof mechanism design framework based on a neural representation of menus. Rather than constructing assignments directly, \texttt{MenuNet} learns to generate personalized probabilistic menus, from which assignments are realized via a structured sequential choice rule that guarantees strategy-proofness by construction. We decompose stability into fairness (no envy) and non-wastefulness, model both properties as vector-valued quantities, and optimize their distribution through differentiable objectives, which provides a principled trade-off between competing axioms. Empirically, \texttt{MenuNet} navigates this trade-off effectively: it consistently yields less envy than Random Serial Dictatorship (RSD) and less waste than Deferred Acceptance (DA), while remaining scalable and computationally efficient. These results suggest that learning-based menu mechanisms provide a flexible and scalable paradigm for mechanism design in highly constrained, real-world environments.
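The menu-based idea can be illustrated with a minimal Python sketch. It is not the \texttt{MenuNet} implementation. The matrix `offer_prob` is a hypothetical stand-in for the network's output, all function names are assumptions, the fixed agent order is a simplification, and the envy and waste vectors follow the textbook notions of justified envy and wasted seats, which may differ from the definitions used in the paper. In this sketch an agent's menu and the seats still available never depend on that agent's own report, so taking the best available item is optimal for the agent.

```python
import random


def sample_menus(offer_prob, rng):
    """Draw one menu per agent: school s is offered to agent i
    with probability offer_prob[i][s] (hypothetical network output)."""
    return [{s for s, p in enumerate(row) if rng.random() < p}
            for row in offer_prob]


def sequential_choice(prefs, menus, capacity):
    """Agents move in a fixed order; each takes their most preferred school
    that is on their menu and still has a free seat (None if there is none)."""
    seats = list(capacity)
    assignment = []
    for i, ranking in enumerate(prefs):
        pick = next((s for s in ranking if s in menus[i] and seats[s] > 0), None)
        if pick is not None:
            seats[pick] -= 1
        assignment.append(pick)
    return assignment, seats


def rank(ranking, school):
    """Position of a school in a ranking; unassigned or unlisted counts as worst."""
    return ranking.index(school) if school in ranking else len(ranking)


def envy_vector(prefs, priority, assignment):
    """envy[i] = number of agents j who hold a school that i prefers to their
    own assignment and at which i has higher priority than j."""
    n = len(prefs)
    envy = [0] * n
    for i in range(n):
        own = rank(prefs[i], assignment[i])
        for j in range(n):
            s = assignment[j]
            if j == i or s is None:
                continue
            if rank(prefs[i], s) < own and priority[s].index(i) < priority[s].index(j):
                envy[i] += 1
    return envy


def waste_vector(prefs, assignment, seats):
    """waste[i] = number of schools with an empty seat that i prefers
    to their own assignment."""
    return [sum(1 for s in prefs[i]
                if seats[s] > 0 and rank(prefs[i], s) < rank(prefs[i], assignment[i]))
            for i in range(len(prefs))]


# Toy instance: 3 agents, 3 schools (all values are illustrative).
prefs = [[0, 1, 2], [0, 1, 2], [1, 0, 2]]      # each agent's ranking, best first
priority = [[1, 0, 2], [0, 1, 2], [2, 1, 0]]   # each school's ranking of agents
capacity = [1, 2, 1]
offer_prob = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [0.0, 0.0, 1.0]]

menus = sample_menus(offer_prob, random.Random(0))
assignment, seats = sequential_choice(prefs, menus, capacity)
envy = envy_vector(prefs, priority, assignment)
waste = waste_vector(prefs, assignment, seats)
print(assignment, envy, waste)
```

In this toy instance the restrictive menu of the third agent leaves a seat empty that the agent would have preferred, which shows how the choice of menus shifts instability between the envy and waste vectors. A learned mechanism would tune `offer_prob` against objectives built from these vectors.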
