Cooperative bargaining games are widely used to model resource allocation and conflict resolution. Traditional solutions assume that the mediator can access the values and gradients of the agents' utility functions. However, in a growing number of settings, such as human-AI interactions, utility values may be inaccessible or incomparable because of unknown, nonaffine transformations. To model such settings, we assume that the mediator has access only to the agents' most preferred directions, i.e., their normalized utility gradients in the decision space. We propose a cooperative bargaining algorithm in which the mediator queries only the direction oracle of each agent. We prove that, unlike popular approaches such as the Nash and Kalai-Smorodinsky bargaining solutions, our approach is invariant to monotonic nonaffine transformations, and that under strong convexity and smoothness assumptions it enjoys global asymptotic convergence to Pareto stationary solutions. Moreover, we show that the bargaining solutions found by our algorithm satisfy the axioms of symmetry and (under slightly stronger conditions) independence of irrelevant alternatives, both of which are popular in the literature. Finally, we conduct experiments in two domains, multi-agent formation assignment and mediated stock portfolio allocation, which validate these theoretical results. All code for our experiments can be found at https://github.com/suryakmurthy/dibs_bargaining.
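The invariance claim can be illustrated with a minimal sketch. This is not the algorithm from the paper or its repository; it is a simplified iteration that moves along the sum of the agents' unit directions, written only to show why a direction oracle is unaffected by a monotone nonaffine transformation. The quadratic utilities, the transformation g(u) = -log(1 - u), the 1/t step schedule, and all function names are assumptions made for illustration. By the chain rule, the gradient of g(u(x)) is g'(u) times the gradient of u(x) with g'(u) > 0, so normalizing removes the scale factor.

```python
import math

def normalize(v):
    """Scale v to unit Euclidean length; a zero vector is returned as zeros."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else [0.0] * len(v)

def grad_u(x, target):
    """Gradient of the assumed utility u(x) = -||x - target||^2."""
    return [-2.0 * (xi - ti) for xi, ti in zip(x, target)]

def grad_transformed_u(x, target):
    """Gradient of g(u(x)) with g(u) = -log(1 - u), increasing and nonaffine.

    Chain rule: grad g(u(x)) = g'(u) * grad u(x), where g'(u) = 1 / (1 - u).
    """
    u = -sum((xi - ti) ** 2 for xi, ti in zip(x, target))
    scale = 1.0 / (1.0 - u)  # positive because u <= 0
    return [scale * gi for gi in grad_u(x, target)]

def direction_step(x, targets, step, grad_fn):
    """Move x along the sum of the agents' normalized gradients (assumed rule)."""
    directions = [normalize(grad_fn(x, t)) for t in targets]
    return [xk + step * sum(d[k] for d in directions) for k, xk in enumerate(x)]

def run(x0, targets, grad_fn, iters=200):
    """Iterate the direction step with an assumed diminishing step size 1/t."""
    x = list(x0)
    for t in range(1, iters + 1):
        x = direction_step(x, targets, 1.0 / t, grad_fn)
    return x

# Three hypothetical agents, each preferring a different point in the plane.
targets = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
x_plain = run((5.0, 5.0), targets, grad_u)
x_warped = run((5.0, 5.0), targets, grad_transformed_u)
print(x_plain, x_warped)
```

Both runs query only unit directions, so they should trace the same path up to floating-point rounding even though the second set of utilities has been transformed. A method that uses utility values, such as maximizing a product of utilities, would in general return different points for the two cases.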