Fault detection and diagnosis is important for reducing maintenance costs and improving health and safety in chemical processes. The convolutional neural network (CNN) is a popular deep learning algorithm with many successful applications in chemical fault detection and diagnosis. However, the convolution layers in a CNN are very sensitive to the order of features, which can make the processing of tabular data unstable. An optimal feature order yields better CNN performance, but finding such an order is expensive. In addition, because the feature extraction mechanism is encapsulated, most CNN models are opaque and poorly interpretable, and therefore cannot identify root-cause features without human supervision. These difficulties inevitably limit the performance and credibility of CNN methods. In this paper, we propose an order-invariant and interpretable hierarchical dilated convolutional neural network (HDLCNN), which is composed of feature clustering, dilated convolution, and the Shapley additive explanations (SHAP) method. The novelty of HDLCNN lies in its ability to process tabular data with features in arbitrary order, without searching for the optimal order; this follows from the ability of feature clustering to agglomerate correlated features and from the large receptive field of dilated convolution. The proposed method also provides interpretability by using SHAP values to quantify feature contributions, so that the root-cause features can be identified as those with the highest contribution. Computational experiments are conducted on the Tennessee Eastman chemical process benchmark dataset. Compared with other methods, the proposed HDLCNN-SHAP method achieves better performance in processing tabular data with features in arbitrary order, detecting faults, and identifying root-cause features.
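The abstract attributes part of the order invariance to the large receptive field of dilated convolution. The following is a minimal sketch of that general idea, not the HDLCNN architecture itself: the kernel size, the dilation rates (1, 2, 4), and the function names `receptive_field` and `dilated_conv1d` are illustrative assumptions and do not come from the paper. It shows how a stack of stride-1 dilated 1-D convolutions lets one output position depend on features that sit far apart in the input vector.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated 1-D convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_conv1d(x, kernel, dilation):
    """'Valid' dilated 1-D convolution over a list of numbers.

    Implemented as cross-correlation (no kernel flip), which is the
    convention most deep learning libraries use for convolution layers.
    """
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(w * x[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(x) - span)
    ]


if __name__ == "__main__":
    # Hypothetical settings: kernel size 3, dilation doubling per layer.
    print(receptive_field(3, [1, 1, 1]))  # three ordinary conv layers
    print(receptive_field(3, [1, 2, 4]))  # three dilated conv layers

    # With dilation 2, each output mixes features two positions apart.
    print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2))
```

Under these assumed settings, three dilated layers cover 15 input positions where three ordinary layers cover 7, with the same number of weights. This is why features that are correlated but not adjacent in the table can still be combined by the same filter.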