Estimating the health state of turbofan engines is a challenging, ill-posed inverse problem, hindered by sparse sensing and complex nonlinear thermodynamics. Research in this area remains fragmented: comparisons are limited by unrealistic datasets, and the use of temporal information is underexplored. This work investigates how to recover component-level health indicators from operational sensor data under realistic degradation and maintenance patterns. To support this study, we introduce a new dataset that incorporates industry-oriented complexities such as maintenance events and usage changes. Using this dataset, we establish an initial benchmark that compares the classic families of methods used for this problem: steady-state data-driven models, nonstationary data-driven models, and Bayesian filters. Beyond this benchmark, we introduce self-supervised learning (SSL) approaches that learn latent representations without access to true health labels, a scenario that reflects real-world operational constraints. By comparing the downstream estimation performance of these self-supervised representations against the direct prediction baselines, we establish a practical lower bound on the difficulty of solving this inverse problem. Our results show that traditional filters remain strong baselines, while SSL methods expose the intrinsic complexity of health estimation and highlight the need for more advanced and interpretable inference strategies. For reproducibility, both the generated dataset and the implementation used in this work are made accessible.