Autonomous artificial intelligence agents in negotiation systems must generate equitable utility allocations that satisfy individual rationality (IR), meaning each agent receives at least its outside option, and that approximate the Nash Bargaining Solution (NBS), which maximizes the product of the agents' surpluses over their outside options. Existing generative models often learn suboptimal human behaviors and produce solutions far from Pareto efficiency, while classical methods require full knowledge of the Pareto frontier, which is unavailable in real datasets. We propose a guided graph diffusion framework that generates individually rational utility vectors and approximates the NBS without frontier knowledge at inference time. Negotiations are modeled as directed graphs, with graph attention capturing asymmetric agent attributes, and a conditional diffusion model maps these graphs to utility vectors. A differentiable composite guidance loss, applied in the final reverse diffusion steps, penalizes IR violations and Nash product gaps. We prove that, under sufficient penalty weighting, solutions enter the IR region in finite time. Across all evaluated datasets, the method achieves 100% IR compliance. Nash efficiency reaches 99.45% on synthetic data (within 0.55 percentage points of an oracle), 54.24% on CaSiNo, and 88.67% on Deal or No Deal, an improvement of 20 to 60 percentage points over unconstrained generative baselines.
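As a rough illustration of the two quantities the guidance loss is said to penalize, the sketch below computes an IR violation penalty and a Nash product for a utility vector `u` and outside options `d`. The abstract does not give the exact form of the loss, so everything here is an assumption made for illustration: the function names, the squared-hinge penalty, the negative-log-surplus term standing in for the Nash product gap, and the weights `w_ir`, `w_nash`, and clamp `eps`.

```python
import math


def ir_penalty(u, d):
    """Squared-hinge penalty (assumed form): positive only if some u_i < d_i."""
    return sum(max(0.0, di - ui) ** 2 for ui, di in zip(u, d))


def nash_product(u, d):
    """Product of surpluses (u_i - d_i); taken to be 0 outside the IR region."""
    if any(ui < di for ui, di in zip(u, d)):
        return 0.0
    return math.prod(ui - di for ui, di in zip(u, d))


def composite_guidance_loss(u, d, w_ir=10.0, w_nash=1.0, eps=1e-6):
    """Hypothetical composite loss: weighted IR penalty minus log Nash product.

    Surpluses are clamped at eps so the logarithm stays finite when an
    agent is at or below its outside option.
    """
    log_nash = sum(math.log(max(ui - di, eps)) for ui, di in zip(u, d))
    return w_ir * ir_penalty(u, d) - w_nash * log_nash


if __name__ == "__main__":
    d = [0.3, 0.3]          # outside options (illustrative values)
    feasible = [0.6, 0.5]   # both agents above their outside option
    violating = [0.6, 0.2]  # second agent below its outside option
    for u in (feasible, violating):
        print(u, ir_penalty(u, d), nash_product(u, d),
              composite_guidance_loss(u, d))
```

Under these assumptions, a vector that violates IR receives a strictly positive penalty and a zero Nash product, so a gradient step on the loss would push it toward the IR region while raising the product of surpluses.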