Over the past few decades, physics-based simulations and data-driven techniques (including machine learning and deep learning) have been widely applied to analyze and predict solar flares. These approaches are pivotal to understanding flare dynamics, with the primary aim of forecasting these events and minimizing the risks they may pose to Earth. Although current methods have made significant progress, data-driven approaches still have limitations. One prominent drawback is that they do not account for the temporal evolution of the active regions from which flares originate. This oversight hinders their ability to capture the relationships among high-dimensional active region features, which in turn limits their operational usability. This study centers on developing interpretable classifiers for multivariate time series and demonstrates a novel feature ranking method based on sliding-window sub-interval ranking. Our primary contribution is to bridge the gap between complex, hard-to-interpret black-box models for high-dimensional data and the exploration of relevant sub-intervals in multivariate time series, specifically in the context of solar flare forecasting. Our findings show that our sliding-window time series forest classifier performs effectively in solar flare prediction (with a True Skill Statistic of over 85\%) while also pinpointing the most important features and sub-intervals for a given learning task.
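To make the two technical ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each sliding sub-interval of a single time series parameter is summarized by its mean, standard deviation, and linear slope (the interval statistics commonly used in time series forests), and that the True Skill Statistic is computed from a binary confusion matrix as recall minus false alarm rate. The function names `sliding_window_features` and `true_skill_statistic`, and the `window` and `step` parameters, are hypothetical and chosen only for illustration.

```python
import numpy as np


def sliding_window_features(series, window, step):
    """Summarize each sliding sub-interval of a 1-D series.

    Returns a list of (start_index, mean, std, slope) tuples, one per window.
    Assumption: mean/std/slope are illustrative interval statistics; the
    abstract does not specify which statistics the authors use.
    """
    series = np.asarray(series, dtype=float)
    t = np.arange(window)
    feats = []
    for start in range(0, len(series) - window + 1, step):
        seg = series[start:start + window]
        slope = np.polyfit(t, seg, 1)[0]  # least-squares linear trend
        feats.append((start, seg.mean(), seg.std(), slope))
    return feats


def true_skill_statistic(y_true, y_pred):
    """TSS = TP / (TP + FN) - FP / (FP + TN) for binary labels.

    Assumes both classes are present in y_true; otherwise a denominator is zero.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    return tp / (tp + fn) - fp / (fp + tn)


if __name__ == "__main__":
    # Toy series: a steadily increasing parameter, windows of length 3.
    for start, mean, std, slope in sliding_window_features([0, 1, 2, 3, 4, 5], window=3, step=1):
        print(start, round(mean, 3), round(std, 3), round(slope, 3))

    # Toy predictions: one of two flaring samples missed, no false alarms.
    print(true_skill_statistic([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a setup like this, per-window statistics from every parameter would be fed to a tree ensemble, and feature importances could then be traced back to a specific parameter and sub-interval, which is the kind of interpretability the abstract describes.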