Autoencoders are powerful machine learning models used to compress information from multiple data sources. However, autoencoders, like all artificial neural networks, are often unidentifiable and uninterpretable. This research develops an identifiable and interpretable autoencoder for combining climate data products. The proposed autoencoder uses a Bayesian statistical framework, which allows probabilistic interpretation, and varies spatially to capture spatial patterns across the data products. Constraints placed on the autoencoder as it learns patterns in the data yield an interpretable consensus that retains the important features of each input. We demonstrate the utility of the autoencoder by combining information from multiple precipitation products in High Mountain Asia.