Repairing incomplete trajectory data is essential for downstream spatio-temporal applications. Yet existing repair methods focus solely on reconstruction and do not document the reasoning behind repair decisions. This undermines trust in safety-critical applications where repaired trajectories affect operational decisions, such as maritime anomaly detection and route planning. We introduce repair provenance: structured, queryable metadata that documents the full reasoning chain behind each repair. Repair provenance transforms imputation from pure data recovery into a task that supports downstream decision-making. We propose VISTA (knowledge-driven interpretable vessel trajectory imputation), a framework that reliably equips repaired trajectories with repair provenance by grounding LLM reasoning in data-verified knowledge. Specifically, we formalize Structured Data-derived Knowledge (SDK), a knowledge model whose data-verifiable components can be validated against real data and used to anchor and constrain LLM-generated explanations. We organize SDK in a Structured Data-derived Knowledge Graph (SD-KG) and establish a data-knowledge-data loop for extraction, validation, and incremental maintenance over large-scale Automatic Identification System (AIS) data. A workflow management layer with parallel scheduling, fault tolerance, and redundancy control ensures consistent and efficient end-to-end processing. Experiments on two large-scale AIS datasets show that VISTA achieves state-of-the-art accuracy, improving over baselines by 5-91% and reducing inference time by 51-93%, while also producing repair provenance. The interpretability of this provenance is further validated through a case study and an interactive demo system.