Collaborative vehicle routing occurs when carriers collaborate by sharing their transportation requests and fulfilling requests on behalf of each other. This achieves economies of scale, thus reducing cost, greenhouse gas emissions and road congestion. But which carrier should partner with whom, and how much should each carrier be compensated? Traditional game-theoretic solution concepts are expensive to calculate because the characteristic function scales exponentially with the number of agents: evaluating it would require solving the vehicle routing problem (NP-hard) an exponential number of times. We therefore propose to model this problem as a coalitional bargaining game solved using deep multi-agent reinforcement learning, where, crucially, agents are not given access to the characteristic function. Instead, the agents implicitly reason about the characteristic function; thus, when deployed in production, we only need to evaluate the expensive post-collaboration vehicle routing problem once. Our contribution is that we are the first to consider both the route allocation problem and the gain sharing problem simultaneously, without access to the expensive characteristic function. Through decentralised machine learning, our agents bargain with each other and agree to outcomes that correlate well with the Shapley value, a fair profit allocation mechanism. Importantly, we achieve an 88% reduction in run-time.
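To illustrate why the characteristic function is the bottleneck, the following minimal sketch computes exact Shapley values by enumerating every coalition and counts how many times the characteristic function is evaluated. It is not the method proposed here: the carrier names, `STANDALONE_COST` and `toy_gain` are hypothetical stand-ins, and in the real setting each call to the characteristic function would mean solving a vehicle routing problem.

```python
from itertools import combinations
from math import factorial


def shapley_values(agents, value):
    """Exact Shapley values by brute-force enumeration of all coalitions.

    `value` maps a frozenset of agents to that coalition's gain.
    Returns (shapley value per agent, number of distinct coalitions evaluated).
    """
    n = len(agents)
    cache = {}  # each distinct coalition is evaluated only once

    def v(coalition):
        key = frozenset(coalition)
        if key not in cache:
            cache[key] = value(key)
        return cache[key]

    phi = {}
    for i in agents:
        others = [a for a in agents if a != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that agent i joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Marginal contribution of agent i to this coalition.
                total += weight * (v(subset + (i,)) - v(subset))
        phi[i] = total
    return phi, len(cache)


# Hypothetical standalone costs for three carriers (illustrative numbers only).
STANDALONE_COST = {"A": 100.0, "B": 80.0, "C": 60.0}


def toy_gain(coalition):
    """Hypothetical collaboration gain; a stand-in for solving a VRP."""
    if len(coalition) < 2:
        return 0.0
    total = sum(STANDALONE_COST[c] for c in coalition)
    return 0.1 * (len(coalition) - 1) * total


phi, evaluations = shapley_values(list(STANDALONE_COST), toy_gain)
print(phi)          # gain allocated to each carrier
print(evaluations)  # number of coalitions evaluated: 2 ** len(agents)
```

With three carriers the characteristic function is evaluated for all 2^3 = 8 coalitions, and the count doubles with every additional carrier. This is the cost that the bargaining approach avoids by evaluating only the final post-collaboration routing problem.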