Machine learning techniques in neutrino physics have traditionally relied on simulated data, which provides access to ground-truth labels. However, the accuracy of these simulations and the discrepancies between simulated and real data remain significant concerns, particularly for large-scale neutrino telescopes that operate in complex natural media. In recent years, self-supervised learning has emerged as a powerful paradigm for reducing dependence on labeled datasets. Here, we present the first self-supervised training pipeline for neutrino telescopes, leveraging point cloud transformers and masked autoencoders. By shifting the majority of training to real data, this approach minimizes reliance on simulations, thereby mitigating associated systematic uncertainties. This represents a fundamental departure from previous machine learning applications in neutrino telescopes, paving the way for substantial improvements in event reconstruction and classification.
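As a rough illustration of the masked-autoencoder idea mentioned above, the following minimal sketch hides a random subset of points in a point cloud and scores a reconstruction on the hidden points only. It is not the pipeline described here: the function names `random_mask_split` and `reconstruction_loss`, the per-hit features (x, y, z, t), the 60% mask ratio, and the mean-squared-error objective are all assumptions chosen for illustration, and the encoder and decoder networks are omitted entirely.

```python
import numpy as np


def random_mask_split(points, mask_ratio=0.6, seed=0):
    """Split a point cloud into visible and masked subsets.

    points: array of shape (n_points, n_features), e.g. hypothetical
    per-hit features (x, y, z, t) for one event.
    Returns (visible, masked, mask), where mask is a boolean array
    with True marking the points hidden from the encoder.
    """
    rng = np.random.default_rng(seed)
    n_points = points.shape[0]
    n_masked = int(round(mask_ratio * n_points))
    # Choose which points to hide, sampling without replacement.
    masked_idx = rng.choice(n_points, size=n_masked, replace=False)
    mask = np.zeros(n_points, dtype=bool)
    mask[masked_idx] = True
    return points[~mask], points[mask], mask


def reconstruction_loss(predicted, target):
    """Mean squared error between predicted and true masked points."""
    return float(np.mean((predicted - target) ** 2))


# Toy event: 10 hits with 4 features each (assumed layout, random values).
event = np.random.default_rng(1).normal(size=(10, 4))
visible, masked, mask = random_mask_split(event, mask_ratio=0.6)

# In a real setup a model would predict `masked` from `visible`;
# here a trivial all-zeros guess stands in for the model output.
baseline_prediction = np.zeros_like(masked)
loss = reconstruction_loss(baseline_prediction, masked)
```

Because the training target is the hidden part of the input itself, no ground-truth labels are needed, which is what allows this kind of pretraining to run on real detector data rather than simulation.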