Beyond the Final Label: Exploiting the Untapped Potential of Classification Histories in Astronomical Light Curve Analysis

The Legacy Survey of Space and Time (LSST) at the Vera C. Rubin Observatory will generate a massive collection of time series (light curves) of the measured flux of transient and variable astronomical objects. With each new flux observation, light curve classifiers must generate updated probability distributions over candidate classes. These distributions will be shared with the global community to identify interesting targets for follow-up observations and to support less time-sensitive analyses. Using the synthetic light curves and the classification results of participating classifiers from the Extended LSST Astronomical Time-series Classification Challenge (ELAsTiCC), we investigate a novel framework that enhances existing light curve classifications by incorporating their classification histories and the temporal evolution of those histories. To demonstrate the potential of this approach, we introduce a model that combines a recurrent neural network with an additive attention module; it shows improved classification accuracy and more balanced precision-recall performance than the existing classifiers from the challenge. Furthermore, most, if not all, existing classifiers are currently evaluated by their final classification results on complete light curves. We propose new metrics that evaluate the stability, accuracy, and early classification performance of a classifier's predictions on limited data by considering the Wasserstein distance between the temporally evolving classification probability distributions. These metrics offer a more comprehensive perspective for model assessment by supplementing classical methods such as the confusion matrix and precision-recall.
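
To make the idea of history-based metrics more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 0/1 ground metric between classes, under which the Wasserstein-1 distance between two categorical distributions reduces to the total variation distance; the abstract does not state which ground metric or aggregation the paper uses. The function names `tv_wasserstein` and `history_metrics`, and the toy history, are hypothetical.

```python
import numpy as np


def tv_wasserstein(p, q):
    """Wasserstein-1 distance between two categorical distributions,
    ASSUMING a 0/1 ground metric between classes (equals total variation)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()


def history_metrics(history, true_class):
    """Summarize a classification history.

    history: array of shape (T, K); row t holds the class probabilities
             reported after the (t+1)-th flux observation.
    true_class: index of the correct class.
    """
    history = np.asarray(history, dtype=float)
    n_steps, n_classes = history.shape

    # One-hot distribution representing the true label.
    target = np.zeros(n_classes)
    target[true_class] = 1.0

    # Stability: how much the prediction moves between consecutive updates.
    step_changes = np.array(
        [tv_wasserstein(history[t], history[t + 1]) for t in range(n_steps - 1)]
    )
    # Accuracy over time: distance of each prediction from the true label.
    dist_to_truth = np.array([tv_wasserstein(row, target) for row in history])

    return {
        "instability": float(step_changes.mean()) if step_changes.size else 0.0,
        "mean_distance_to_truth": float(dist_to_truth.mean()),
        "distance_to_truth": dist_to_truth,
    }


# Toy history (hypothetical): three updates over three candidate classes.
history = np.array([
    [0.40, 0.30, 0.30],
    [0.60, 0.20, 0.20],
    [0.90, 0.05, 0.05],
])
metrics = history_metrics(history, true_class=0)
```

In this sketch a lower `instability` means the classifier changes its mind less between observations, and a `distance_to_truth` curve that drops quickly indicates good early classification. A different ground metric (for example, one that treats related transient classes as closer together) would need a general optimal transport solver instead of the closed form used here.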
