Retrieval algorithms are used to estimate atmospheric concentrations of greenhouse gases (GHGs), such as carbon dioxide (CO2) and methane (CH4), by solving inverse problems from high-spectral-resolution satellite radiance measurements. However, these algorithms are computationally expensive, which makes real-time estimation at scale difficult. Machine-learning models have therefore been proposed as fast emulators of retrieval algorithms. Most existing studies, however, evaluate them only on test data from the same period as the training data. We study the stability over time of such emulators using data from the Greenhouse Gases Observing SATellite (GOSAT). We show that prediction accuracy generally deteriorates when the test period moves away from the training period. We also show that including time as an input feature substantially improves XCH4 prediction for Lasso and neural-network models. Among the methods considered, a simple Lasso model performs as well as or better than more complex methods such as neural networks, and yields more stable predictions over time. We further validate the results using the Total Carbon Column Observing Network (TCCON), a ground-based observation network. On the TCCON-matched dataset, the time-augmented Lasso achieves errors against TCCON that are comparable to the disagreement between GOSAT and TCCON for both XCO2 and XCH4.
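The evaluation protocol described above (train on one period, test on progressively later periods, with and without time as an input feature) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the setup used in this study: the data are synthetic, with random features standing in for radiance-derived inputs and a linear upward trend standing in for the growth of XCH4; the helper names `fit_lasso` and `rmse_by_period` are hypothetical; and the Lasso is a small coordinate-descent implementation written out so that the example depends only on NumPy.

```python
import numpy as np


def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)


def fit_lasso(X, y, alpha=0.01, n_iter=200):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + alpha*||b||_1."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean          # centering handles the intercept
    col_sq = (Xc ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    resid = yc.copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += Xc[:, j] * beta[j]      # remove feature j's contribution
            rho = Xc[:, j] @ resid / n
            beta[j] = soft_threshold(rho, alpha) / col_sq[j]
            resid -= Xc[:, j] * beta[j]      # add the updated contribution back
    intercept = y_mean - x_mean @ beta
    return beta, intercept


def rmse_by_period(beta, intercept, X, y, t, edges):
    """RMSE of the linear model on each test window [edges[i], edges[i+1])."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = (t >= lo) & (t < hi)
        pred = X[m] @ beta + intercept
        out.append(float(np.sqrt(np.mean((y[m] - pred) ** 2))))
    return out


# Synthetic stand-in data (assumption): 20 features, 6 "years" of samples,
# and a target that drifts upward linearly with time.
rng = np.random.default_rng(0)
n, p = 6000, 20
t = rng.uniform(0.0, 6.0, size=n)            # years since the start of the record
X = rng.normal(size=(n, p))
w = np.zeros(p)
w[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]
y = 1800.0 + X @ w + 8.0 * t + rng.normal(scale=2.0, size=n)

train = t < 2.0                              # train on the first two years only
edges = [2.0, 3.0, 4.0, 5.0, 6.0]            # four later one-year test windows
X_time = np.column_stack([X, t])             # same features plus time

beta, b0 = fit_lasso(X[train], y[train])
rmse_plain = rmse_by_period(beta, b0, X, y, t, edges)

beta_t, b0_t = fit_lasso(X_time[train], y[train])
rmse_time = rmse_by_period(beta_t, b0_t, X_time, y, t, edges)

for i, (a, b) in enumerate(zip(rmse_plain, rmse_time)):
    print(f"test year {edges[i]:.0f}-{edges[i + 1]:.0f}: "
          f"RMSE without time = {a:.2f}, with time = {b:.2f}")
```

In this toy setting the model without a time input degrades as the test window moves away from the training period, because it cannot represent the trend, while the time-augmented model extrapolates it. Real retrieval data need not drift linearly, so the sketch shows only the shape of the evaluation, not the size of the effect reported here.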