Location-tracking data from the Automatic Identification System (AIS), much of which is publicly available, plays a key role in a range of maritime safety and monitoring applications. However, the data suffers from missing values that hamper downstream applications. Imputing the missing values is challenging because heterogeneous attributes are updated at different rates, which gives rise to multi-scale dependencies among attributes. Existing imputation methods assume similar update rates across attributes and therefore cannot capture or exploit such dependencies, which limits their imputation accuracy. We propose MH-GIN, a Multi-scale Heterogeneous Graph-based Imputation Network that aims to improve imputation accuracy by capturing multi-scale dependencies. Specifically, MH-GIN first extracts multi-scale temporal features for each attribute while preserving the attribute's intrinsic heterogeneous characteristics. It then constructs a multi-scale heterogeneous graph that explicitly models dependencies between heterogeneous attributes, enabling more accurate imputation of missing values through graph propagation. Experimental results on two real-world datasets show that MH-GIN reduces imputation errors by an average of 57% compared to state-of-the-art methods, while maintaining computational efficiency. The source code and implementation details of MH-GIN are publicly available at https://github.com/hyLiu1994/MH-GIN.