Scalable Ride-Sourcing Vehicle Rebalancing with Service Accessibility Guarantee: A Constrained Mean-Field Reinforcement Learning Approach

The expansion of ride-sourcing services such as Uber and Lyft has reshaped urban transportation by offering flexible, on-demand mobility via mobile applications. Despite their convenience, these platforms confront significant operational challenges, particularly vehicle rebalancing: the strategic repositioning of a fleet of vehicles to address spatiotemporal mismatches in supply and demand. Inadequate rebalancing not only results in prolonged rider waiting times and inefficient vehicle utilization, but also leads to fairness issues, such as the inequitable distribution of service and disparities in driver income. To tackle these challenges, we introduce continuous-state mean-field control (MFC) and mean-field reinforcement learning (MFRL) models with continuous repositioning actions. MFC and MFRL offer scalable solutions by modeling each vehicle's behavior through its interaction with the vehicle distribution, rather than with individual vehicles. This mitigates the curse of dimensionality with respect to the number of agents, enabling coordination across large fleets with significantly reduced computational complexity and eliminating the need to retrain the model when the fleet size changes. To ensure equitable service access across geographic regions, we integrate an accessibility constraint into the models and derive rebalancing policies that strike a balance between high fulfillment of rider demand and fair coverage of vehicle supply. Extensive evaluation using a data-driven simulation of Shenzhen demonstrates the efficiency and robustness of our approach. Remarkably, it scales to tens of thousands of vehicles, with training times comparable to those of linear programming rebalancing. Moreover, our policies effectively explore the efficiency-equity Pareto front, outperforming conventional benchmarks across key metrics such as fleet utilization, fulfilled requests, and pickup distance, while ensuring equitable service access.
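The claim that a mean-field policy does not depend on fleet size can be illustrated with a minimal sketch. It is not the paper's model: it assumes a discretized city of numbered zones (the paper works with continuous states), and the function names and the `min_share` threshold are hypothetical stand-ins, since the abstract does not specify the form of the accessibility constraint.

```python
from collections import Counter


def vehicle_distribution(vehicle_zones, num_zones):
    """Return the fraction of the fleet located in each zone.

    This normalized vector plays the role of the mean-field state:
    its length depends on the number of zones, not on the fleet size.
    """
    counts = Counter(vehicle_zones)
    total = len(vehicle_zones)
    return [counts.get(zone, 0) / total for zone in range(num_zones)]


def accessibility_violations(distribution, min_share):
    """List zones whose vehicle share falls below a minimum share.

    Hypothetical stand-in for an accessibility constraint; the actual
    constraint used in the paper may be defined differently.
    """
    return [zone for zone, share in enumerate(distribution) if share < min_share]


# Two fleets with the same spatial pattern but very different sizes.
small_fleet = [0, 0, 1, 2]          # 4 vehicles over 3 zones
large_fleet = small_fleet * 5000    # 20,000 vehicles, same pattern

mu_small = vehicle_distribution(small_fleet, num_zones=3)
mu_large = vehicle_distribution(large_fleet, num_zones=3)

# Both fleets yield the same distribution, so a policy that takes the
# distribution as input sees an identical state in both cases.
same_state = mu_small == mu_large
underserved = accessibility_violations(mu_small, min_share=0.3)
```

Because a policy of this kind consumes only the distribution, its input dimension is fixed by the spatial discretization, which is one way to read the abstract's statement that no retraining is needed when the fleet size changes.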
