Despite recent advances in AI and NLP, negotiation remains a difficult domain for AI agents. Traditional game-theoretic approaches, which have worked well for two-player zero-sum games, struggle in negotiation settings because they cannot learn human-compatible strategies. Conversely, approaches that rely solely on human data tend to be domain-specific and lack the theoretical guarantees of strategies grounded in game theory. Motivated by the notion of fairness as a criterion for optimality in general-sum games, we propose a negotiation framework called FDHC that incorporates fairness into both the reward design and the search procedure to learn human-compatible negotiation strategies. Our method includes a novel RL+search technique, LGM-Zero, which leverages a pre-trained language model to retrieve human-compatible offers from large action spaces. Our results show that our method achieves more egalitarian negotiation outcomes and improves negotiation quality.