Deep learning models are often considered black boxes, as their internal workings tend to be opaque to the user. Because of this lack of transparency, it is challenging to understand the reasoning behind a model's predictions. Here, we present an approach to making a deep learning-based solar storm prediction model interpretable, where solar storms include solar flares and coronal mass ejections (CMEs). The model, built on a long short-term memory (LSTM) network with an attention mechanism, aims to predict whether an active region (AR) on the Sun's surface that produces a flare within 24 hours will also produce a CME associated with that flare. The crux of our approach is to model the data samples in an AR as time series and use the LSTM network to capture their temporal dynamics. To make the model's predictions accountable and reliable, we leverage post hoc, model-agnostic techniques that elucidate the factors contributing to the predicted output for an input sequence and provide insight into the model's behavior across multiple sequences within an AR. To our knowledge, this is the first time that interpretability has been added to an LSTM-based solar storm prediction model.
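To make the described architecture concrete, the following is a minimal sketch of an LSTM encoder with attention for this binary prediction task, written in PyTorch. The layer sizes, the number of input features, and the additive (Bahdanau-style) form of the attention are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class AttentiveLSTM(nn.Module):
    """LSTM encoder with additive attention for binary CME prediction.

    A minimal sketch of the architecture described in the abstract;
    hidden size, feature count, and attention form are assumptions.
    """

    def __init__(self, n_features: int = 18, hidden_size: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        # Additive attention: score each time step's hidden state.
        self.attn = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )
        self.classifier = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor):
        # x: (batch, time, n_features) -- one sequence of AR data samples.
        h, _ = self.lstm(x)                               # (batch, time, hidden)
        scores = self.attn(h).squeeze(-1)                 # (batch, time)
        weights = torch.softmax(scores, dim=1)            # attention over time
        context = (weights.unsqueeze(-1) * h).sum(dim=1)  # (batch, hidden)
        logit = self.classifier(context).squeeze(-1)
        # Return P(CME | flaring AR) and the attention weights, which
        # themselves offer a first, model-internal view of importance.
        return torch.sigmoid(logit), weights

# Example: a batch of 4 sequences, each with 10 time steps of 18 features.
model = AttentiveLSTM()
prob, attn_weights = model(torch.randn(4, 10, 18))
```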
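A post hoc, model-agnostic probe in the spirit of the techniques the abstract mentions can be sketched as occlusion-based attribution, which only queries the trained model's outputs and therefore works for any black-box predictor. The helper below (`occlusion_attribution`) is hypothetical and not from the paper; it scores each (time step, feature) cell of one input sequence by how much the predicted CME probability drops when that cell is replaced with a baseline value.

```python
import torch

def occlusion_attribution(predict, x, baseline=0.0):
    """Model-agnostic occlusion attribution (illustrative sketch).

    `predict` maps a (1, time, features) tensor to a scalar probability;
    each (time step, feature) cell is scored by how much the prediction
    drops when that cell is replaced with `baseline`.
    """
    with torch.no_grad():
        base = predict(x)
        _, T, F = x.shape
        attr = torch.zeros(T, F)
        for t in range(T):
            for f in range(F):
                x_occ = x.clone()
                x_occ[0, t, f] = baseline
                # Positive score: occluding this cell lowered the
                # predicted probability, so it supported the prediction.
                attr[t, f] = (base - predict(x_occ)).item()
    return attr

# Usage with the AttentiveLSTM sketched above: wrap the model so that
# `predict` returns only the probability for a single sequence.
model = AttentiveLSTM().eval()
x = torch.randn(1, 10, 18)  # one AR sequence: 10 time steps, 18 features
attr = occlusion_attribution(lambda s: model(s)[0], x)
```

Aggregating such per-sequence attributions over the multiple sequences of an AR is one way to obtain the cross-sequence view of model behavior that the abstract describes.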