The integrity of Water Quality Data (WQD) is critical in environmental monitoring, where it underpins scientific decision-making and ecological protection. However, water quality monitoring systems often suffer from large amounts of missing data caused by unavoidable problems such as sensor failures and communication delays, which leaves the data High-Dimensional and Sparse (HDS). Traditional data imputation methods struggle to describe the underlying dynamics and fail to capture deep data features, resulting in unsatisfactory imputation performance. To address these issues, this paper proposes a Nonlinear Low-rank Representation (NLR) model built on Convolutional Neural Networks (CNNs) for imputing missing WQD. The model uses CNNs to implement two ideas: a) fusing temporal features to model the temporal dependence of data across time slots, and b) extracting nonlinear interactions and local patterns to mine higher-order relational features and achieve deep fusion of multidimensional information. Experimental studies on three real-world water quality datasets demonstrate that the proposed model significantly outperforms existing state-of-the-art data imputation models in estimation accuracy. It thus provides an effective approach for handling water quality monitoring data in complex dynamic environments.
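To make the imputation setting concrete, the following is a minimal sketch of plain linear low-rank imputation on a partially observed matrix (time slots by sensors). It is not the proposed NLR model: it uses an ordinary inner product of latent factors where NLR, as described above, uses CNN-based temporal fusion and nonlinear interactions. The function name `lowrank_impute` and all hyperparameters (`rank`, `lr`, `reg`, `epochs`) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def lowrank_impute(X, mask, rank=2, lr=0.01, reg=0.01, epochs=2000, seed=0):
    """Fill unobserved entries of X with a rank-`rank` factorization U @ V.T.

    X    : 2-D array (time slots x sensors); unobserved entries may be NaN.
    mask : boolean array, True where X was actually observed.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = 0.1 * rng.standard_normal((n, rank))  # latent factors for time slots
    V = 0.1 * rng.standard_normal((m, rank))  # latent factors for sensors
    Xo = np.where(mask, X, 0.0)               # zero out unobserved entries (drops NaNs)
    for _ in range(epochs):
        E = mask * (U @ V.T - Xo)             # residual on observed entries only
        grad_U = E @ V + reg * U              # gradient of the regularized squared error
        grad_V = E.T @ U + reg * V
        U -= lr * grad_U
        V -= lr * grad_V
    # Keep observed values untouched; use the model only where data is missing.
    return np.where(mask, X, U @ V.T)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    truth = rng.standard_normal((30, 2)) @ rng.standard_normal((5, 2)).T
    mask = rng.random(truth.shape) > 0.3      # roughly 30% of entries missing
    observed = np.where(mask, truth, np.nan)
    filled = lowrank_impute(observed, mask)
    rmse = np.sqrt(np.mean((filled[~mask] - truth[~mask]) ** 2))
    print(f"RMSE on missing entries: {rmse:.3f}")
```

The loss is evaluated only on observed entries, which is what lets a low-rank model be fitted to sparse data; a nonlinear model of the kind the abstract describes would keep that masked objective but replace `U @ V.T` with a learned interaction function.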