Neural Approximate Dynamic Programming for the Ultra-fast Order Dispatching Problem

Same-Day Delivery (SDD) services aim to maximize the fulfillment of online orders while minimizing delivery delays, but they are beset by operational uncertainties, such as uncertain order volumes and courier planning. Our work aims to enhance the operational efficiency of SDD by focusing on the ultra-fast Order Dispatching Problem (ODP), which involves matching and dispatching orders to couriers within a centralized warehouse setting and completing each delivery within a strict time limit (e.g., within minutes). We introduce important extensions to the ultra-fast ODP, such as order batching and explicit courier assignments, to provide a more realistic representation of dispatching operations and improve delivery efficiency. As a solution method, we primarily focus on NeurADP, a methodology that combines Approximate Dynamic Programming (ADP) and Deep Reinforcement Learning (DRL); our work constitutes the first application of NeurADP outside of the ride-pool matching problem. NeurADP is particularly suitable for the ultra-fast ODP because it addresses complex one-to-many matching and routing intricacies through a neural network-based value function approximation (VFA) that captures high-dimensional problem dynamics without requiring the manual feature engineering of generic ADP methods. We test our proposed approach on four distinct realistic datasets tailored for ODP and compare the performance of NeurADP against myopic and DRL baselines, also using non-trivial bounds to assess the quality of the policies. Our numerical results indicate that the inclusion of order batching and courier queues enhances the efficiency of delivery operations and that NeurADP significantly outperforms other methods. A detailed sensitivity analysis over important parameters confirms the robustness of NeurADP under different scenarios, including variations in the number of couriers, spatial setup, vehicle capacity, and permitted delay time.
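
To make the one-to-many matching idea concrete, the following is a minimal sketch of a single dispatch decision, not the formulation used in this work. It assumes each courier receives one batch of at most `capacity` orders (possibly empty), that no order is assigned twice, and that each courier-batch pair is scored as an immediate reward plus an estimate of future value. The names `feasible_batches`, `dispatch`, and `toy_score` are hypothetical, the scoring function is a hand-written stand-in for a learned neural VFA, and the exhaustive search stands in for whatever optimization routine is actually used.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Batch = Tuple[str, ...]  # a batch is a tuple of order ids


def feasible_batches(orders: List[str], capacity: int) -> List[Batch]:
    """Enumerate every batch of 0..capacity orders (empty batch = courier idles)."""
    batches: List[Batch] = [()]
    for size in range(1, capacity + 1):
        batches.extend(combinations(orders, size))
    return batches


def dispatch(
    couriers: List[str],
    orders: List[str],
    capacity: int,
    score: Callable[[str, Batch], float],
) -> Tuple[Dict[str, Batch], float]:
    """Exhaustively assign one batch per courier, maximizing the summed score.

    Only suitable for toy instances: the search space grows exponentially.
    """
    batches = feasible_batches(orders, capacity)
    best_total = float("-inf")
    best_plan: Dict[str, Batch] = {}

    def search(i: int, used: frozenset, total: float, plan: Dict[str, Batch]) -> None:
        nonlocal best_total, best_plan
        if i == len(couriers):
            if total > best_total:
                best_total, best_plan = total, dict(plan)
            return
        courier = couriers[i]
        for batch in batches:
            if used.isdisjoint(batch):  # each order is served at most once
                plan[courier] = batch
                search(i + 1, used | set(batch), total + score(courier, batch), plan)
        plan.pop(courier, None)

    search(0, frozenset(), 0.0, {})
    return best_plan, best_total


def toy_score(courier: str, batch: Batch) -> float:
    """Assumed stand-in for immediate reward plus a learned value estimate."""
    immediate_reward = float(len(batch))       # one unit per order served now
    future_value = -0.25 * len(batch) ** 2     # busier couriers are worth less later
    return immediate_reward + future_value


plan, total = dispatch(["c1", "c2"], ["o1", "o2", "o3"], capacity=2, score=toy_score)
print(plan, total)
```

In this toy instance the future-value penalty makes a two-order batch only slightly more attractive than a single order, so the search spreads the three orders across both couriers instead of leaving one idle. A myopic policy corresponds to dropping the `future_value` term.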
