Delivery Time Estimation (DTE) is a crucial component of the e-commerce supply chain that predicts delivery time based on merchant information, sending address, receiving address, and payment time. Accurate DTE can boost platform revenue and reduce customer complaints and refunds. However, the imbalanced nature of industrial data prevents previous models from reaching satisfactory prediction performance. Although imbalanced regression methods can be applied to the DTE task, we find experimentally that they improve prediction performance on low-shot samples at the expense of overall performance. To address this issue, we propose a novel Dual Graph Multitask framework for imbalanced Delivery Time Estimation (DGM-DTE). Our framework first classifies package delivery times into head and tail data. Then, a dual graph-based model learns representations of the two categories of data. In particular, DGM-DTE re-weights the embeddings of tail data by estimating their kernel density. We then fuse the two graph-based representations to capture both high-shot and low-shot data. Experiments on real-world Taobao logistics datasets demonstrate the superior performance of DGM-DTE compared to baselines.
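To make the head/tail split and the density-based re-weighting concrete, here is a minimal sketch in plain Python. It is not the DGM-DTE implementation: the fixed threshold, the Gaussian kernel, the bandwidth, the inverse-density weighting rule, and all function names (`split_head_tail`, `gaussian_kde`, `tail_weights`) are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def split_head_tail(times, threshold):
    """Split delivery times into head (common) and tail (rare) groups.

    Assumption: a fixed threshold defines the split; the abstract does
    not say how the boundary is chosen.
    """
    head = [t for t in times if t <= threshold]
    tail = [t for t in times if t > threshold]
    return head, tail


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples`
    using a Gaussian kernel with the given bandwidth."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def tail_weights(tail, all_times, bandwidth=0.5):
    """Weight each tail sample by the inverse of its estimated density,
    normalized so the weights average to 1.

    Assumption: inverse-density weighting is one common choice; the
    actual re-weighting rule may differ.
    """
    density = gaussian_kde(all_times, bandwidth)
    raw = [1.0 / density(t) for t in tail]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


# Toy delivery times in days (made-up values for illustration only).
delivery_days = [1, 1, 2, 2, 2, 2, 3, 3, 3, 6, 6, 9]
head, tail = split_head_tail(delivery_days, threshold=5)
weights = tail_weights(tail, delivery_days)
```

In this toy data the 9-day delivery is rarer than the 6-day deliveries, so it receives the larger weight. Such weights could then scale the tail embeddings or their loss terms before the two representations are fused.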