The field of automated negotiation has a long history of designing agents that can negotiate with other agents. Such negotiation strategies have traditionally been based on manual design and heuristics. More recently, reinforcement learning approaches have also been used to train agents to negotiate. However, negotiation problems are diverse, so the observation and action dimensions change from one problem to the next, which standard policy networks built from fixed-size linear layers cannot handle. Previous work has circumvented this issue either by fixing the negotiation problem, which makes policies non-transferable between negotiation problems, or by abstracting the observations and actions into fixed-size representations, which loses information and expressiveness because of the feature design. We developed an end-to-end reinforcement learning method for diverse negotiation problems by representing observations and actions as a graph and applying graph neural networks in the policy. Empirical evaluations show that our method is effective and that the learned policies can negotiate with other agents on never-before-seen negotiation problems. These results open up new opportunities for reinforcement learning in negotiation agents.
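To illustrate why a graph representation removes the fixed-size constraint, the following is a minimal sketch and not the architecture evaluated in this work. It assumes a negotiation problem with several issues, each with its own number of values, and encodes every value as a node whose features (own utility, issue weight, and whether the opponent last offered it) are chosen here purely for illustration. The names `build_graph` and `TinyGraphPolicy` are hypothetical, and the single round of mean-aggregation message passing stands in for a full graph neural network. Because the weights are shared across nodes, the same untrained policy produces a valid action distribution for problems of different sizes.

```python
import numpy as np


def build_graph(utilities, weights, last_offer):
    """Flatten a negotiation problem into per-value node features.

    utilities:  list of 1-D arrays, own utility of each value, one array per issue
    weights:    1-D array with one weight per issue
    last_offer: list of ints, index of the value the opponent last offered per issue
    (These features are illustrative assumptions, not the paper's encoding.)
    """
    feats, issue_of = [], []
    for i, issue_utils in enumerate(utilities):
        for v, util in enumerate(issue_utils):
            feats.append([util, weights[i], float(last_offer[i] == v)])
            issue_of.append(i)  # edge from this value node to its issue
    return np.array(feats), np.array(issue_of)


class TinyGraphPolicy:
    """One round of message passing with weights shared across all nodes."""

    def __init__(self, n_feat=3, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.w_self = rng.normal(scale=0.5, size=(n_feat, hidden))
        self.w_nbr = rng.normal(scale=0.5, size=(n_feat, hidden))
        self.w_out = rng.normal(scale=0.5, size=hidden)

    def action_probs(self, feats, issue_of):
        n_issues = int(issue_of.max()) + 1
        # Aggregate value-node features per issue (mean over each issue's values).
        sums = np.zeros((n_issues, feats.shape[1]))
        np.add.at(sums, issue_of, feats)
        counts = np.bincount(issue_of, minlength=n_issues)[:, None]
        issue_msg = sums / counts
        # Update each value node from its own features and its issue's message.
        h = np.maximum(0.0, feats @ self.w_self + issue_msg[issue_of] @ self.w_nbr)
        logits = h @ self.w_out
        # One softmax per issue: the action picks one value for every issue.
        probs = []
        for i in range(n_issues):
            z = logits[issue_of == i]
            e = np.exp(z - z.max())
            probs.append(e / e.sum())
        return probs


policy = TinyGraphPolicy()

# A problem with one issue and two values.
small = build_graph([np.array([0.2, 1.0])], np.array([1.0]), [0])

# A problem with three issues that have 3, 2, and 4 values.
large = build_graph(
    [np.array([0.0, 0.5, 1.0]), np.array([1.0, 0.3]), np.array([0.1, 0.9, 0.4, 0.7])],
    np.array([0.5, 0.3, 0.2]),
    [2, 0, 1],
)

probs_small = policy.action_probs(*small)
probs_large = policy.action_probs(*large)
```

In this sketch the parameter shapes depend only on the feature and hidden sizes, never on the number of issues or values, which is the property that lets a single policy be applied to negotiation problems it has not seen before. A fixed-size linear layer over a flattened observation would instead need a different input width for each problem.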