Trajectories that capture object movement have numerous applications, in which similarity computation between trajectories often plays a key role. Traditionally, the similarity between two trajectories is quantified by heuristic measures, e.g., Hausdorff or ERP, that operate directly on the trajectories. In contrast, recent studies use deep learning to map trajectories to d-dimensional vectors, called embeddings, and then apply a distance measure, e.g., Manhattan or Euclidean, to the embeddings to quantify trajectory similarity. The resulting similarities are inaccurate: they only approximate the similarities obtained using the heuristic measures. Because distance computation on embeddings is efficient, the focus has been on learning embeddings that yield high accuracy. Adopting an efficiency perspective, we analyze the time complexities of both the heuristic and the learning-based approaches and find that the time complexities of the heuristic approaches are not necessarily higher. Through extensive experiments on open datasets, we find that, on both CPUs and GPUs, only a few learning-based approaches deliver the promised higher efficiency when the embeddings can be pre-computed, whereas heuristic approaches are more efficient for one-off computations. Among the learning-based approaches, the self-attention-based ones are the fastest at learning embeddings, and their embeddings also yield the highest accuracy for similarity queries. These results have implications for the choice of trajectory similarity approach under different application requirements.
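To make the contrast between the two families concrete, the following minimal Python sketch computes a heuristic measure (the symmetric Hausdorff distance) directly on two trajectories and, separately, a Manhattan or Euclidean distance on embeddings. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names are hypothetical, and `toy_embed` is a random-projection stand-in for a learned encoder, so its embeddings make no attempt to approximate any heuristic measure.

```python
import numpy as np


def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between trajectories a (n x 2) and b (m x 2).

    Builds the full pairwise distance matrix, so time and memory are O(n * m).
    """
    diff = a[:, None, :] - b[None, :, :]          # shape (n, m, 2)
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # shape (n, m)
    a_to_b = dist.min(axis=1).max()               # farthest point of a from b
    b_to_a = dist.min(axis=0).max()               # farthest point of b from a
    return float(max(a_to_b, b_to_a))


def toy_embed(traj: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a learned encoder (NOT a trained model).

    Projects each point with a fixed (2 x d) matrix, applies tanh, and
    mean-pools over the points to obtain one d-dimensional vector.
    """
    return np.tanh(traj @ weights).mean(axis=0)


def embedding_distance(u: np.ndarray, v: np.ndarray, metric: str = "euclidean") -> float:
    """Distance between two d-dimensional embeddings in O(d) time."""
    if metric == "manhattan":
        return float(np.abs(u - v).sum())
    if metric == "euclidean":
        return float(np.sqrt(((u - v) ** 2).sum()))
    raise ValueError(f"unknown metric: {metric}")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t1 = rng.random((50, 2))          # trajectory with 50 points
    t2 = rng.random((80, 2))          # trajectory with 80 points
    w = rng.normal(size=(2, 16))      # assumed embedding dimension d = 16

    print("Hausdorff:", hausdorff(t1, t2))
    e1, e2 = toy_embed(t1, w), toy_embed(t2, w)   # could be pre-computed once
    print("Euclidean on embeddings:", embedding_distance(e1, e2))
    print("Manhattan on embeddings:", embedding_distance(e1, e2, "manhattan"))
```

In this sketch the embedding distance itself costs only O(d), but producing each embedding is a separate step. For a one-off comparison that step is paid on every call, while it is amortized when embeddings are computed once and reused across many queries, which is the trade-off the abstract points to.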