Data quality assessment is an essential step for ensuring the reliability of downstream structural health monitoring (SHM) tasks. This study proposes a prediction-deviation-based method for assessing SHM data quality that uses a univariate implicit autoregressive model to enable outlier diagnosis and data cleaning. The proposed conditional diffusion model (CDM) augments the standard diffusion model with three components: a conditional embedding module that incorporates temporal context, quartile normalization that mitigates distribution skew, and a Huber loss that improves robustness to outliers. Within this univariate implicit autoregressive framework, each data point is assigned an outlier probability that quantifies its degree of "outlier-ness", and a global quality evaluation score is computed to characterize the quality of the dataset as a whole. Extensive case studies on operational data from real-world structures show that the proposed framework significantly improves the accuracy of data quality assessment and outperforms strong baselines representative of clustering, isolation-based, and deep reconstruction methods. Ablation experiments and a hyperparameter analysis further confirm the effectiveness and robustness of the framework.
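Two of the components named in the abstract, quartile normalization and the Huber loss, can be illustrated with a minimal sketch. The abstract does not give their exact form, so the code below assumes that quartile normalization means centering on the median and scaling by the interquartile range, and it uses the textbook Huber loss with a threshold `delta`. The function names `quartile_normalize` and `huber_loss` are hypothetical and are not taken from the paper.

```python
import numpy as np


def quartile_normalize(x):
    """Center on the median and scale by the interquartile range (IQR).

    Assumed form of "quartile normalization"; the paper may define it differently.
    """
    x = np.asarray(x, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # avoid division by zero on (near-)constant signals
    return (x - median) / iqr


def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss: quadratic for small residuals, linear for large ones."""
    residual = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    quadratic = 0.5 * residual ** 2
    linear = delta * (residual - 0.5 * delta)
    return float(np.where(residual <= delta, quadratic, linear).mean())


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 100.0]  # toy series with one gross outlier
    print(quartile_normalize(signal))
    print(huber_loss([0.0, 0.0], [0.5, 3.0]))
```

Because the median and IQR are barely affected by a single extreme reading, the scaling of the bulk of the data stays stable when outliers are present, and the linear tail of the Huber loss limits how strongly such readings influence training compared with a squared-error loss.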