Feature selection is the process of selecting useful features for machine learning tasks, and it is a key step in structural health monitoring (SHM). This paper proposes a fast feature selection algorithm that efficiently computes, within a greedy search, the sum of squared canonical correlation coefficients between the monitored features and the target variables of interest. The algorithm is applied to both synthetic and real datasets to illustrate its advantages in computational speed, in general classification and regression tasks, and in damage-sensitive feature selection. Its performance is also evaluated under varying environmental conditions and on an edge computing device to investigate its applicability to real-world SHM scenarios. The results show that the algorithm selects useful features at very high computational speed, which suggests strong potential for applications where features must be selected and updated online frequently, or where devices have limited computing capability.
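To make the selection criterion concrete, the following is a minimal Python sketch of greedy forward selection scored by the sum of squared canonical correlation coefficients. It is a naive baseline that recomputes the score from scratch for every candidate, not the fast computation proposed in the paper, which is not reproduced here. The function names `orth_basis`, `sscc`, and `greedy_select`, and the tolerance `tol`, are illustrative assumptions. The sketch relies on the standard result that the canonical correlations between two centered data matrices are the singular values of the product of orthonormal bases of their column spaces.

```python
import numpy as np


def orth_basis(a, tol=1e-10):
    """Return an orthonormal basis for the column space of column-centered a."""
    a = np.asarray(a, dtype=float)
    a = a - a.mean(axis=0)
    u, s, _ = np.linalg.svd(a, full_matrices=False)
    if s.size == 0 or s[0] == 0.0:
        return u[:, :0]  # empty basis for an all-constant input
    # Drop directions whose singular value is negligible (rank deficiency).
    return u[:, s > tol * s[0]]


def sscc(x, y):
    """Sum of squared canonical correlations between x and y (rows are samples)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    qx, qy = orth_basis(x), orth_basis(y)
    # Singular values of qx.T @ qy are the canonical correlations, so the
    # squared Frobenius norm is the sum of their squares.
    return float(np.sum((qx.T @ qy) ** 2))


def greedy_select(x, y, n_select):
    """Naive greedy forward selection: add the feature that maximizes sscc."""
    x = np.asarray(x, dtype=float)
    selected = []
    remaining = list(range(x.shape[1]))
    for _ in range(n_select):
        scores = [sscc(x[:, selected + [j]], y) for j in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical synthetic example: the target depends on features 2 and 5 only.
rng = np.random.default_rng(0)
x_demo = rng.standard_normal((200, 8))
y_demo = x_demo[:, 2] + 0.5 * x_demo[:, 5] + 0.01 * rng.standard_normal(200)
chosen = greedy_select(x_demo, y_demo, 2)
```

Each greedy step in this baseline performs a fresh decomposition for every remaining candidate, so its cost grows quickly with the number of features. That repeated work is the cost an efficient computation of the criterion aims to avoid.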