Differentiable Normative Guidance for Nash Bargaining Solution Recovery

Autonomous artificial intelligence agents in negotiation systems must generate equitable utility allocations that satisfy individual rationality (IR), meaning each agent receives at least its outside option, and that approximate the Nash Bargaining Solution (NBS), which maximizes the product of the agents' surpluses over their outside options (the Nash product). Existing generative models often learn suboptimal human behaviors and produce solutions far from Pareto efficiency, while classical methods require full knowledge of the Pareto frontier, which is unavailable in real datasets. We propose a guided graph diffusion framework that generates individually rational utility vectors while approximating the NBS without frontier knowledge at inference time. Negotiations are modeled as directed graphs, with graph attention capturing asymmetric agent attributes, and a conditional diffusion model maps these graphs to utility vectors. A differentiable composite guidance loss, applied in the final reverse diffusion steps, penalizes IR violations and Nash product gaps. We prove that, under sufficient penalty weighting, solutions enter the IR region in finite time. The method achieves 100% IR compliance on every dataset evaluated. Nash efficiency reaches 99.45% on synthetic data (within 0.55 percentage points of an oracle), 54.24% on CaSiNo, and 88.67% on Deal or No Deal, an improvement of 20-60 percentage points over unconstrained generative baselines.
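To make the guidance idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a linear hinge penalty on IR shortfalls and a negative log Nash product term; the function names, the weights `lam_ir` and `lam_nash`, the step size, and the clipping constant `eps` are all hypothetical choices made for illustration. It runs plain gradient steps on a fixed utility vector, whereas the described method applies the guidance inside the final reverse diffusion steps, where the learned denoiser keeps samples near realistic outcomes.

```python
def ir_violation(u, d):
    """Total shortfall of utilities u below outside options d (0 inside the IR region)."""
    return sum(max(di - ui, 0.0) for ui, di in zip(u, d))


def nash_product(u, d):
    """Product of surpluses over outside options; 0 if any agent is at or below d."""
    p = 1.0
    for ui, di in zip(u, d):
        p *= max(ui - di, 0.0)
    return p


def guidance_grad(u, d, lam_ir=5.0, lam_nash=0.1, eps=1e-3):
    """Gradient of lam_ir * sum(max(d - u, 0)) - lam_nash * sum(log(max(u - d, eps)))."""
    grad = []
    for ui, di in zip(u, d):
        s = ui - di
        g_ir = -1.0 if s < 0.0 else 0.0        # hinge term is active only when IR is violated
        g_nash = -1.0 / s if s > eps else 0.0  # log term is clipped near the boundary
        grad.append(lam_ir * g_ir + lam_nash * g_nash)
    return grad


def guide(u0, d, step=0.01, max_steps=100):
    """Take gradient steps until the utility vector enters the IR region."""
    u = list(u0)
    for k in range(max_steps):
        if ir_violation(u, d) == 0.0:
            return u, k
        g = guidance_grad(u, d)
        u = [ui - step * gi for ui, gi in zip(u, g)]
    return u, max_steps


# Agent 0 starts below its outside option (0.1 < 0.3); agent 1 starts above (0.6 > 0.2).
u_final, steps = guide([0.1, 0.6], [0.3, 0.2])
print(u_final, steps, nash_product(u_final, [0.3, 0.2]))
```

In this toy, a violating agent's utility rises by a fixed `step * lam_ir` per iteration, so the number of steps needed to reach the IR region is bounded by the initial shortfall divided by that increment. This mirrors, in a much simpler setting, the finite-time entry property stated in the abstract.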