Spatio-temporal trajectories provide valuable information about movement and travel behavior, enabling various downstream tasks that in turn power real-world applications. Learning trajectory embeddings can improve task performance, but it may incur high computational costs and suffer from limited training data. Pre-training learns generic embeddings through specially constructed pretext tasks that enable learning from unlabeled data. Existing pre-training methods face three challenges: (i) pretext tasks bias the embeddings towards certain downstream tasks, which makes general embeddings difficult to learn; (ii) they are limited in capturing both travel semantics and spatio-temporal correlations; and (iii) long, irregularly sampled trajectories are complex to model. To tackle these challenges, we propose Maximum Multi-view Trajectory Entropy Coding (MMTEC) for learning general and comprehensive trajectory embeddings. We introduce a pretext task that reduces biases in pre-trained trajectory embeddings, yielding embeddings that are useful for a wide variety of downstream tasks. We also propose an attention-based discrete encoder and a NeuralCDE-based continuous encoder that extract travel behavior and continuous spatio-temporal correlations, respectively, from trajectories and represent them in embeddings. Extensive experiments on two real-world datasets and three downstream tasks offer insight into the design properties of our proposal and indicate that it is capable of outperforming existing trajectory embedding methods.
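To illustrate why a controlled differential equation is a natural fit for irregularly sampled trajectories, the following is a minimal sketch of the general Neural CDE idea, dz(t) = f(z(t)) dX(t), discretized with a plain Euler step. It is not the MMTEC continuous encoder: the function names, the random single-layer vector field, the zero initial state, and the `(t, lon, lat)` point layout are all assumptions made for illustration, and nothing here is trained.

```python
import math
import random


def make_vector_field(hidden_dim, input_dim, seed=0):
    """Hypothetical stand-in for a learned vector field f(z).

    Returns a function mapping a hidden state z (length hidden_dim) to a
    hidden_dim x input_dim matrix, using one random tanh layer.
    """
    rng = random.Random(seed)
    n_out = hidden_dim * input_dim
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
               for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]

    def field(z):
        flat = [math.tanh(sum(w * zi for w, zi in zip(row, z)) + b)
                for row, b in zip(weights, biases)]
        # Reshape the flat output into hidden_dim rows of input_dim entries.
        return [flat[i * input_dim:(i + 1) * input_dim]
                for i in range(hidden_dim)]

    return field


def cde_encode(points, field, hidden_dim):
    """Euler discretization of dz = f(z) dX along the observed path.

    Each point is (t, lon, lat). Time is one channel of the path X, so the
    update is driven by the actual increment between consecutive samples
    and does not assume a fixed sampling interval.
    """
    z = [0.0] * hidden_dim  # assumed initial state
    for prev, curr in zip(points, points[1:]):
        dx = [c - p for p, c in zip(prev, curr)]
        f = field(z)
        z = [zi + sum(fij * dxj for fij, dxj in zip(row, dx))
             for zi, row in zip(z, f)]
    return z


# Three samples with uneven time gaps (0.1 and 0.9 time units).
trajectory = [(0.0, 116.30, 39.98), (0.1, 116.31, 39.98), (1.0, 116.35, 40.01)]
field = make_vector_field(hidden_dim=4, input_dim=3)
embedding = cde_encode(trajectory, field, hidden_dim=4)
```

Because the hidden state is driven by path increments, repeating an observation (a zero increment) leaves the embedding unchanged in this sketch. A practical encoder would instead learn the vector field and integrate over an interpolated path with a proper ODE solver.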