Travel time estimation is a core function of navigation apps and web mapping services. Current deterministic and probabilistic methods primarily model individual trips and assume that trips are independent of one another. In real-world scenarios, however, we often observe strong inter-trip correlations caused by factors such as weather conditions, traffic management, and road works. In this paper, we propose to model trip-level link travel time using a Gaussian hierarchical model, which can characterize both inter-trip and intra-trip correlations. The joint distribution of the travel times of multiple trips becomes a multivariate Gaussian parameterized by learnable link representations. To make effective use of sparse GPS trajectories, we also propose a data augmentation method based on trip sub-sampling, which allows for fine-grained gradient backpropagation when learning link representations. During inference, we estimate the probability distribution of the travel time of a queried trip conditioned on completed trips that are spatiotemporally adjacent. We refer to the overall framework as ProbTTE. We evaluate ProbTTE on two real-world GPS trajectory datasets, and the results show that it outperforms state-of-the-art deterministic and probabilistic baselines. Additionally, we find that the learned link representations align well with the physical geometry of the network, making them suitable as input for other applications.
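The inference step described above, estimating a queried trip's travel time conditioned on a correlated completed trip, can be illustrated with standard Gaussian conditioning. The following is a minimal sketch under stated assumptions, not the ProbTTE implementation: the three-link network, the covariance values, the split into a shared (inter-trip) covariance plus independent per-trip variances, and all function names are hypothetical choices made for illustration.

```python
# Toy illustration only. The numbers, the covariance structure, and the helper
# names below are assumptions, not the parameterization used by ProbTTE.

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def quad(u, m, v):
    """Bilinear form u^T M v for a list-of-lists matrix M."""
    return sum(u[i] * m[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))

# Assumed mean travel time (seconds) on each of three links.
LINK_MEAN = [60.0, 90.0, 45.0]
# Assumed link covariance shared by all trips in the same time window
# (this shared term is what makes different trips correlated).
SHARED_COV = [[100.0, 40.0, 0.0],
              [40.0, 225.0, 30.0],
              [0.0, 30.0, 64.0]]
# Assumed trip-specific link variances, independent across trips.
TRIP_VAR = [25.0, 36.0, 16.0]

def trip_moments(links):
    """Mean and variance of a trip's total time; links[i] weights link i."""
    mean = dot(links, LINK_MEAN)
    own = sum(w * w * v for w, v in zip(links, TRIP_VAR))
    return mean, quad(links, SHARED_COV, links) + own

def trip_covariance(links_a, links_b):
    """Covariance between two different trips (shared term only)."""
    return quad(links_a, SHARED_COV, links_b)

def condition_on_trip(mean_q, var_q, mean_o, var_o, cov_qo, observed_o):
    """Condition a bivariate Gaussian (query, observed) on the observed time."""
    gain = cov_qo / var_o
    cond_mean = mean_q + gain * (observed_o - mean_o)
    cond_var = var_q - gain * cov_qo
    return cond_mean, cond_var

query_links = [1, 1, 0]      # queried trip uses links 0 and 1
completed_links = [0, 1, 1]  # completed trip uses links 1 and 2

mean_q, var_q = trip_moments(query_links)
mean_o, var_o = trip_moments(completed_links)
cov_qo = trip_covariance(query_links, completed_links)

# The completed trip took 30 s longer than its prior mean.
cond_mean, cond_var = condition_on_trip(
    mean_q, var_q, mean_o, var_o, cov_qo, observed_o=mean_o + 30.0)

print(f"prior:     mean={mean_q:.1f}s var={var_q:.1f}")
print(f"posterior: mean={cond_mean:.1f}s var={cond_var:.1f}")
```

In this sketch, a slower-than-expected completed trip shifts the queried trip's estimate upward and shrinks its variance, because the two trips have positive covariance through the shared term. With several completed trips, the scalar gain would be replaced by a linear solve against the completed trips' covariance matrix.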