Delivery Time Estimation (DTE) is a crucial component of the e-commerce supply chain that predicts delivery time based on merchant information, sending address, receiving address, and payment time. Accurate DTE can boost platform revenue and reduce customer complaints and refunds. However, the imbalanced nature of industrial data prevents previous models from reaching satisfactory prediction performance. Although imbalanced regression methods can be applied to the DTE task, we find experimentally that they improve prediction performance on low-shot samples at the expense of overall performance. To address this issue, we propose a novel Dual Graph Multitask framework for imbalanced Delivery Time Estimation (DGM-DTE). Our framework first classifies each package's delivery time as head or tail data. It then uses a dual graph-based model to learn representations of the two categories. In particular, DGM-DTE re-weights the embeddings of tail data by estimating their kernel density. Finally, the two graph-based representations are fused to capture both high-shot and low-shot data. Experiments on real-world Taobao logistics datasets demonstrate the superior performance of DGM-DTE compared to baselines.
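To make the head/tail split and the kernel-density re-weighting more concrete, the following is a minimal sketch under stated assumptions, not the DGM-DTE implementation. The abstract does not specify how the split is made, which kernel is used, or how density maps to a weight. Here the split is a fixed threshold, the kernel is Gaussian, and the weight is the normalized inverse density. All function names, the threshold, the bandwidth, and the toy data are hypothetical.

```python
import numpy as np


def split_head_tail(delivery_hours, threshold):
    """Return a boolean mask that is True for tail samples.

    Assumption: a simple threshold stands in for the framework's classifier.
    """
    return np.asarray(delivery_hours, dtype=float) > threshold


def gaussian_kde_density(values, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each sample."""
    values = np.asarray(values, dtype=float)
    # Pairwise scaled differences between all samples, shape (n, n).
    diffs = (values[:, None] - values[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth


def inverse_density_weights(values, bandwidth=1.0, eps=1e-8):
    """Give rarer labels larger weights; weights are scaled to mean 1.

    Assumption: inverse density is one common choice, not necessarily
    the weighting used by DGM-DTE.
    """
    density = gaussian_kde_density(values, bandwidth)
    weights = 1.0 / (density + eps)
    return weights / weights.mean()


# Hypothetical toy data: delivery times in hours.
delivery_hours = np.array([20.0, 24.0, 26.0, 30.0, 96.0, 100.0, 104.0, 240.0])
is_tail = split_head_tail(delivery_hours, threshold=72.0)
tail_hours = delivery_hours[is_tail]
tail_weights = inverse_density_weights(tail_hours, bandwidth=10.0)

# Stand-in embeddings; in the framework these would come from the tail graph model.
rng = np.random.default_rng(0)
tail_embeddings = rng.normal(size=(tail_hours.size, 4))
reweighted_embeddings = tail_embeddings * tail_weights[:, None]
```

In this toy example the most isolated tail sample (240 hours) has the lowest estimated density and therefore receives the largest weight, which is the intuition behind density-based re-weighting of rare delivery times.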