Data-driven industrial health prognostics require rich training data to build accurate and reliable predictive models. However, stringent data privacy laws and the abundance of industrial data held at the edge call for decentralized data utilization. The field is therefore well placed to benefit from federated learning (FL), a decentralized, privacy-preserving learning technique. Yet FL-based health prognostics has rarely been investigated, because it is difficult to meaningfully aggregate model parameters trained on heterogeneous data into a high-performing federated model. Specifically, data heterogeneity among edge devices, stemming from dissimilar degradation mechanisms and unequal dataset sizes, poses a critical statistical challenge for developing accurate federated models. We propose a pioneering FL-based health prognostic model with a feature similarity-matched parameter aggregation algorithm that learns selectively from heterogeneous edge data. The algorithm searches across the heterogeneous locally trained models, matches neurons with probabilistically similar feature extraction functions, and then selectively averages the matched neurons to form the federated model parameters. Because the algorithm averages only similar neurons, rather than naively averaging neurons coordinate-wise as in conventional aggregation, the distinct feature extractors of the local models carry over to the resulting federated model with less dilution. Using both cyclic degradation data from Li-ion batteries and non-cyclic data from turbofan engines, we demonstrate that the proposed method yields accuracy improvements of up to 44.5% and 39.3% for state-of-health estimation and remaining useful life estimation, respectively.
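The difference between coordinate-wise averaging and similarity-matched averaging can be illustrated with a minimal sketch. It is not the proposed algorithm: the probabilistic matching is replaced here by a simple greedy one-to-one match on cosine similarity, and the one-hidden-layer layout, the names `match_neurons` and `matched_average`, and the dataset-size weighting are all assumptions made for illustration.

```python
import numpy as np


def match_neurons(ref_feats, other_feats, eps=1e-12):
    """Greedy one-to-one match of hidden units by cosine similarity.

    Returns perm such that other_feats[perm[i]] is paired with ref_feats[i].
    Assumption: both models have the same number of hidden units.
    """
    ref_n = ref_feats / (np.linalg.norm(ref_feats, axis=1, keepdims=True) + eps)
    oth_n = other_feats / (np.linalg.norm(other_feats, axis=1, keepdims=True) + eps)
    sim = ref_n @ oth_n.T  # sim[i, j]: similarity of ref unit i and other unit j
    n = sim.shape[0]
    perm = np.full(n, -1)
    for _ in range(n):
        # Take the most similar remaining pair, then retire its row and column.
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        perm[i] = j
        sim[i, :] = -np.inf
        sim[:, j] = -np.inf
    return perm


def matched_average(clients, sizes):
    """Average one-hidden-layer models after aligning hidden units.

    clients: list of dicts with W1 (hidden, in), b1 (hidden,), W2 (out, hidden).
    sizes:   local dataset sizes, used as averaging weights (an assumption).
    The first client serves as the reference ordering.
    """
    weights = np.asarray(sizes, dtype=float) / np.sum(sizes)
    ref = clients[0]
    ref_feats = np.hstack([ref["W1"], ref["b1"][:, None]])
    fed = {k: weights[0] * ref[k] for k in ("W1", "b1", "W2")}
    for w, c in zip(weights[1:], clients[1:]):
        feats = np.hstack([c["W1"], c["b1"][:, None]])
        perm = match_neurons(ref_feats, feats)
        fed["W1"] += w * c["W1"][perm]      # reorder incoming weights
        fed["b1"] += w * c["b1"][perm]      # reorder biases
        fed["W2"] += w * c["W2"][:, perm]   # reorder outgoing weights
    return fed


# Toy check: client b is client a with its hidden units shuffled.
rng = np.random.default_rng(0)
a = {"W1": rng.normal(size=(4, 3)), "b1": rng.normal(size=4),
     "W2": rng.normal(size=(1, 4))}
shuffle = np.array([2, 0, 3, 1])
b = {"W1": a["W1"][shuffle], "b1": a["b1"][shuffle], "W2": a["W2"][:, shuffle]}

fed = matched_average([a, b], sizes=[100, 300])
naive_W1 = 0.25 * a["W1"] + 0.75 * b["W1"]  # coordinate-wise average
```

In this toy case the two clients compute the same function with differently ordered hidden units, so the matched average recovers client `a` exactly while the coordinate-wise average blends unrelated neurons. A real setting with genuinely different local models would call for a softer, probabilistic notion of similarity, as the abstract describes.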