Travel time estimation is a key task in navigation apps and web mapping services. Existing deterministic and probabilistic methods assume that trips are independent, so they model individual trips and overlook correlations between trips. In practice, real-world conditions frequently introduce strong correlations between trips, driven by both external and internal factors such as weather and driver tendencies. To address this, we propose ProbETA, a deep hierarchical joint probabilistic model for travel time estimation that captures both inter-trip and intra-trip correlations. The joint distribution of travel times across multiple trips is modeled as a low-rank multivariate Gaussian, parameterized by learnable link representations that are estimated with an empirical Bayes approach. We also introduce a data augmentation method based on trip sub-sampling, which allows fine-grained gradient backpropagation when learning link representations. During inference, the model estimates the probability distribution of travel time for a queried trip, conditional on spatiotemporally adjacent completed trips. Evaluation on two real-world GPS trajectory datasets shows that ProbETA outperforms state-of-the-art deterministic and probabilistic baselines, reducing Mean Absolute Percentage Error (MAPE) by more than 12.60%. Moreover, the learned link representations align with the physical network geometry, which suggests they may be applicable to other tasks.
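The conditioning step at inference can be illustrated with standard Gaussian algebra. The following is a minimal NumPy sketch, not the ProbETA implementation: it assumes a covariance of the form "low-rank factors plus diagonal noise", where each trip's factor is the sum of the embeddings of the links it traverses. The names `link_emb`, `incidence`, `link_mean`, and `noise_var`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def low_rank_covariance(factors, noise_var):
    """Build a covariance of the form F F^T + diag(noise_var)."""
    return factors @ factors.T + np.diag(noise_var)


def conditional_travel_time(mu, cov, observed_idx, observed_times, query_idx):
    """Condition a joint Gaussian over trip travel times on observed trips.

    Returns the mean and covariance of the queried trips given the
    travel times of the completed (observed) trips.
    """
    o = np.asarray(observed_idx)
    q = np.asarray(query_idx)
    cov_oo = cov[np.ix_(o, o)]
    cov_qo = cov[np.ix_(q, o)]
    cov_qq = cov[np.ix_(q, q)]
    residual = np.asarray(observed_times) - mu[o]
    # gain = cov_qo @ inv(cov_oo), computed with a linear solve
    # (valid because cov_oo is symmetric).
    gain = np.linalg.solve(cov_oo, cov_qo.T).T
    cond_mean = mu[q] + gain @ residual
    cond_cov = cov_qq - gain @ cov_qo.T
    return cond_mean, cond_cov


# Toy setup (all values hypothetical): 5 links, rank-2 link embeddings, 3 trips.
rng = np.random.default_rng(0)
link_emb = rng.normal(size=(5, 2))                # assumed link representations
link_mean = np.array([30.0, 45.0, 20.0, 60.0, 25.0])  # assumed mean seconds per link
incidence = np.array([                            # rows: trips, columns: links
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 1],
], dtype=float)

mu = incidence @ link_mean                        # prior mean travel time per trip
factors = incidence @ link_emb                    # trips sharing links get correlated
noise_var = np.full(3, 4.0)                       # assumed independent per-trip noise
cov = low_rank_covariance(factors, noise_var)

# Trips 0 and 1 are completed; trip 2 is the query.
observed_times = np.array([80.0, 130.0])
cond_mean, cond_cov = conditional_travel_time(
    mu, cov, observed_idx=[0, 1], observed_times=observed_times, query_idx=[2]
)
print("prior mean/var:    ", mu[2], cov[2, 2])
print("posterior mean/var:", cond_mean[0], cond_cov[0, 0])
```

In this sketch, trips that share links have correlated travel times, so observing completed trips shifts the queried trip's mean and can only shrink its variance relative to the prior. A real system would also exploit the low-rank structure to avoid forming the full covariance matrix.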