As large volumes of trajectory data accumulate, trajectory simplification, which reduces storage and querying costs, is attracting increasing study. Existing proposals face three main problems. First, they require numerous iterations to decide which GPS points to delete. Second, they focus only on the relationships between neighboring points (local information) and neglect the overall structure (global information). This reduces the global similarity between the simplified and original trajectories and makes it difficult to keep query results consistent, especially for similarity-based queries. Finally, they fail to differentiate the importance of points with similar features, so the points they retain do not preserve the original trajectory information as well as they could. We propose MLSimp, a novel Mutual Learning query-driven trajectory simplification framework that integrates two distinct models: GNN-TS, based on graph neural networks, and Diff-TS, based on diffusion models. GNN-TS evaluates the importance of a point according to its globality, which captures its correlation with the entire trajectory, and its uniqueness, which captures its differences from neighboring points. It also incorporates attention mechanisms in the GNN layers, which integrate data from all points within the same trajectory simultaneously and refine their representations, thus avoiding iterative processes. Diff-TS generates amplified signals that enable the most important points to be retained at low compression rates. Experiments involving eight baselines on three databases show that MLSimp reduces the simplification time by 42% to 70% and improves query accuracy over simplified trajectories by up to 34.6%.
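The two notions GNN-TS uses to score a point, globality and uniqueness, can be illustrated with a small hand-written sketch. This is not the MLSimp model: there is no GNN, attention, or diffusion component here, and the function names (`importance_scores`, `simplify`), the mean-pooled trajectory summary, the cosine and Euclidean measures, and the unweighted sum of the two terms are all assumptions made for illustration. The per-point vectors in `emb` stand in for learned point representations. The sketch only shows how a single scoring pass over all points, followed by a top-k selection, can replace an iterative point-deletion loop.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def importance_scores(emb):
    """Score every point once (hypothetical stand-in for a learned scorer).

    globality  : cosine similarity between a point and the mean of all
                 points in the trajectory (assumed trajectory summary).
    uniqueness : mean Euclidean distance to the adjacent points.
    """
    n, d = len(emb), len(emb[0])
    mean = [sum(p[j] for p in emb) / n for j in range(d)]
    scores = []
    for i, p in enumerate(emb):
        globality = cosine(p, mean)
        neighbors = [emb[j] for j in (i - 1, i + 1) if 0 <= j < n]
        uniqueness = (
            sum(math.dist(p, q) for q in neighbors) / len(neighbors)
            if neighbors
            else 0.0
        )
        # Unweighted sum is an illustrative choice, not the paper's formula.
        scores.append(globality + uniqueness)
    return scores


def simplify(emb, keep):
    """Return sorted indices of the points to keep.

    The two endpoints are always kept (so at least two indices are
    returned when the trajectory has two or more points); the remaining
    slots go to the highest-scoring interior points.
    """
    n = len(emb)
    if keep >= n:
        return list(range(n))
    scores = importance_scores(emb)
    kept = {0, n - 1}
    ranked = sorted(range(1, n - 1), key=lambda i: scores[i], reverse=True)
    kept.update(ranked[: max(keep - 2, 0)])
    return sorted(kept)


if __name__ == "__main__":
    # Toy trajectory: five nearly collinear points and one sharp detour.
    emb = [[1, 0], [1, 0.1], [1, 0.2], [0, 5], [1, 0.4], [1, 0.5]]
    print(simplify(emb, 3))
```

In the toy trajectory the detour at index 3 differs strongly from both of its neighbors, so it is the interior point kept when only three points are retained. In a learned setting, the scores would come from the trained model rather than from these fixed formulas.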