VISTA: Knowledge-Driven Interpretable Vessel Trajectory Imputation via Large Language Models

The Automatic Identification System (AIS) provides critical information for maritime navigation and safety, yet AIS trajectories are often incomplete due to signal loss or deliberate tampering. Existing imputation methods focus on trajectory recovery; they pay limited attention to interpretability and do not expose the underlying knowledge that benefits downstream tasks such as anomaly detection and route planning. We propose VISTA (knowledge-driven interpretable vessel trajectory imputation), the first trajectory imputation framework that offers interpretability while also providing underlying knowledge to support downstream analysis. First, we define underlying knowledge as the combination of Structured Data-derived Knowledge (SDK) distilled from AIS data and Implicit LLM Knowledge acquired from large-scale Internet corpora. Second, to manage and leverage the SDK at scale, we develop a data-knowledge-data loop that uses a Structured Data-derived Knowledge Graph for SDK extraction and knowledge-driven trajectory imputation. Third, to process large-scale AIS data efficiently, we introduce a workflow management layer that coordinates the end-to-end pipeline, enabling parallel knowledge extraction and trajectory imputation with anomaly handling and redundancy elimination. Experiments on two large AIS datasets show that VISTA achieves state-of-the-art imputation accuracy and computational efficiency, improving over state-of-the-art baselines by 5%-94% and reducing time cost by 51%-93%, while producing interpretable knowledge cues that benefit downstream tasks. The source code and implementation details of VISTA are publicly available.
