System outputs such as eigenfrequencies or strain data, which are often used in structural health monitoring (SHM), respond not only to damage but also to environmental conditions. Methods that correct for these confounding effects often assume, at least implicitly, that environmental conditions affect only the expected (mean) output values. However, real-world SHM data indicate that environmental conditions may also influence higher-order statistical moments, in particular the variances of the output quantities and the covariances and correlations between them, for example between eigenfrequencies of different modes or between strain sensors at different locations. To address this, we discuss two approaches for identifying and quantifying multivariate confounding effects on output covariances and correlations: a random forest and a nonparametric, kernel-based approach. We compare the two methods on both artificial and real-world SHM data and find that the kernel-based approach achieves higher accuracy, whereas the random forest produces estimates that are more robust and sometimes easier to interpret.
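To make the idea of a confounder-dependent covariance concrete, the following is a minimal sketch of one possible kernel-based estimator: a Nadaraya-Watson-type weighted mean and covariance evaluated at a chosen confounder value. It is not the authors' estimator. The function names (`kernel_conditional_moments`, `cov_to_corr`, `simulate`), the Gaussian kernel, the fixed bandwidth, and the synthetic "temperature" data are all assumptions made for illustration.

```python
import numpy as np


def kernel_conditional_moments(z, y, z0, bandwidth):
    """Kernel-weighted mean and covariance of y at confounder value z0.

    z: (n,) confounder (e.g. temperature); y: (n, p) outputs.
    Assumes a Gaussian kernel with a fixed, user-chosen bandwidth.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    # Observations with z close to z0 receive larger weights.
    w = np.exp(-0.5 * ((z - z0) / bandwidth) ** 2)
    w = w / w.sum()
    mean = w @ y                           # local mean, shape (p,)
    resid = y - mean                       # deviations from the local mean
    cov = (resid * w[:, None]).T @ resid   # weighted covariance (no bias correction)
    return mean, cov


def cov_to_corr(cov):
    """Convert a covariance matrix into a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


def simulate(n=4000, seed=0):
    """Hypothetical data: two outputs whose correlation varies with temperature."""
    rng = np.random.default_rng(seed)
    temp = rng.uniform(-30.0, 30.0, n)
    rho = 0.9 * temp / 30.0                # true correlation, between -0.9 and 0.9
    e1 = rng.standard_normal(n)
    e2 = rng.standard_normal(n)
    y1 = 0.02 * temp + e1
    y2 = -0.01 * temp + rho * e1 + np.sqrt(1.0 - rho**2) * e2
    return temp, np.column_stack([y1, y2])


temp, y = simulate()
for t0 in (-20.0, 0.0, 20.0):
    _, cov = kernel_conditional_moments(temp, y, t0, bandwidth=3.0)
    print(t0, round(cov_to_corr(cov)[0, 1], 2))
```

In this synthetic setting the estimated correlation should change sign between cold and warm conditions, which a correction applied to the mean alone would not capture. In practice the bandwidth would need to be selected from the data, for example by cross-validation.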