We study a sequential decision-making problem for a profit-maximizing operator of an Autonomous Mobility-on-Demand system. Optimizing a central operator's vehicle-to-request dispatching policy requires efficient and effective fleet control strategies. To this end, we employ a multi-agent Soft Actor-Critic algorithm combined with weighted bipartite matching. We propose a novel vehicle-based algorithm architecture and adapt the critic's loss function to appropriately consider global actions. Furthermore, we extend our algorithm to incorporate rebalancing capabilities. Through numerical experiments, we show that our approach outperforms state-of-the-art benchmarks by up to 12.9% for dispatching and up to 38.9% with integrated rebalancing.
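To illustrate the weighted bipartite matching step that the abstract pairs with the multi-agent Soft Actor-Critic algorithm, the following is a minimal sketch rather than the authors' implementation. It assumes each vehicle agent has already produced a score for every open request (the `scores` matrix and the `dispatch` function are hypothetical names introduced here), and it assumes a vehicle may remain unassigned. For clarity it finds the maximum-weight matching by brute force, which is only practical for toy instances; a real system would use a polynomial-time assignment solver.

```python
from itertools import permutations


def dispatch(scores):
    """Return the max-weight vehicle-to-request matching.

    scores[v][r] is a hypothetical per-vehicle weight for serving request r
    (in the paper's setting such weights would come from the learned agents).
    The result is (assignment, total), where assignment[v] is a request index
    or None if vehicle v is left unassigned.
    """
    n_vehicles = len(scores)
    n_requests = len(scores[0]) if scores else 0
    # One None per vehicle so that every vehicle may stay unassigned.
    options = list(range(n_requests)) + [None] * n_vehicles

    best_total, best_assignment = float("-inf"), None
    # Each permutation assigns distinct options to vehicles, so no request
    # is ever given to two vehicles.
    for perm in permutations(options, n_vehicles):
        total = sum(scores[v][r] for v, r in enumerate(perm) if r is not None)
        if total > best_total:
            best_total, best_assignment = total, list(perm)
    return best_assignment, best_total


# Toy instance: 3 vehicles, 2 requests (all numbers are made up).
scores = [
    [4.0, 1.0],   # vehicle 0
    [3.0, 2.5],   # vehicle 1
    [-1.0, 0.5],  # vehicle 2
]
assignment, total = dispatch(scores)
print(assignment, total)
```

In this toy instance, vehicle 0 takes request 0, vehicle 1 takes request 1, and vehicle 2 stays idle, because no other one-to-one assignment yields a higher summed score.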