Uncertainty-Aware Delivery Delay Duration Prediction via Multi-Task Deep Learning

Accurate delivery delay prediction is critical for maintaining operational efficiency and customer satisfaction across modern supply chains. Yet the increasing complexity of logistics networks, which span multimodal transportation, cross-country routing, and pronounced regional variability, makes this prediction task inherently challenging. This paper introduces a multi-task deep learning model for delivery delay duration prediction under severe data imbalance, where delayed shipments are rare but operationally consequential. The model encodes high-dimensional tabular shipment features with dedicated embedding layers and then applies a classification-then-regression strategy to predict the delay duration for both on-time and delayed shipments. Unlike sequential pipelines, this approach enables end-to-end training, improves the detection of delayed cases, and supports probabilistic forecasting for uncertainty-aware decision making. The proposed approach is evaluated against traditional machine learning methods on a large-scale real-world dataset from an industrial partner, comprising more than 10 million historical shipment records across four major source locations with distinct regional characteristics. Experimental results show that the proposed method achieves a mean absolute error of 0.67-0.91 days for delayed-shipment predictions, outperforming single-step tree-based regression baselines by 41-64% and two-step classify-then-regress tree-based models by 15-35%. These gains demonstrate the effectiveness of the proposed model for operational delivery delay forecasting under highly imbalanced and heterogeneous conditions.
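
The classification-then-regression idea can be illustrated with a minimal sketch of a joint training objective. The abstract does not give the loss function, so everything below is an assumption made for illustration: the names `multitask_loss`, `expected_delay`, and `reg_weight` are hypothetical, the classification term is taken to be binary cross-entropy on the delayed/on-time flag, and the regression term is taken to be absolute error computed only over delayed shipments.

```python
import math


def multitask_loss(p_delay, duration_pred, is_delayed, duration_true, reg_weight=1.0):
    """Hypothetical joint objective for a classify-then-regress model.

    p_delay       : predicted probability that each shipment is delayed
    duration_pred : predicted delay duration in days for each shipment
    is_delayed    : ground-truth flags (1 = delayed, 0 = on time)
    duration_true : ground-truth delay duration in days
    reg_weight    : assumed weight balancing the two task losses
    """
    eps = 1e-12  # guards against log(0)
    n = len(p_delay)

    # Classification term: mean binary cross-entropy over all shipments.
    bce = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps))
        for p, y in zip(p_delay, is_delayed)
    ) / n

    # Regression term: mean absolute error over delayed shipments only,
    # so the many on-time records do not dominate the duration loss.
    errors = [
        abs(d_hat - d)
        for d_hat, d, y in zip(duration_pred, duration_true, is_delayed)
        if y == 1
    ]
    mae = sum(errors) / len(errors) if errors else 0.0

    # Summing both terms lets a shared network be trained end to end.
    return bce + reg_weight * mae


def expected_delay(p_delay, duration_pred):
    """Combine both heads: delay probability times predicted duration."""
    return [p * d for p, d in zip(p_delay, duration_pred)]
```

Restricting the regression term to delayed shipments is one plausible way to handle the imbalance described above. The model evaluated in the paper may weight or combine the two tasks differently.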
