Training machine learning models for radioisotope identification using gamma spectroscopy remains challenging in many practical applications, largely because large, diverse experimental datasets are difficult to acquire and label. Simulations can mitigate this challenge, but the accuracy of models trained on simulated data can deteriorate substantially when they are deployed to an out-of-distribution operational environment. In this study, we demonstrate that unsupervised domain adaptation (UDA) can improve the ability of a model trained on synthetic data to generalize to a new testing domain, provided that unlabeled data from the target domain are available. Conventional supervised techniques cannot use these data because, without isotope labels, no supervised classification loss can be defined. We compare a range of UDA techniques and find that feature alignment strategies, particularly via maximum mean discrepancy (MMD) minimization or domain-adversarial training, yield the most consistent improvement in testing scores. For instance, using a custom transformer-based neural network, we achieve a testing accuracy of $0.904 \pm 0.022$ on an experimental LaBr$_3$ test set after performing unsupervised feature alignment via MMD minimization, compared to $0.754 \pm 0.014$ before alignment. Overall, our results highlight the potential of UDA to adapt a radioisotope classifier trained on synthetic data for real-world deployment.
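To make the MMD-based feature alignment mentioned above more concrete, the following is a minimal sketch of a squared-MMD estimate between two batches of feature vectors. It is an illustration under stated assumptions and not the implementation used in this study: the Gaussian kernel, the fixed `bandwidth`, the biased estimator, and the names `rbf_kernel`, `mmd_squared`, `sim_feats`, and `exp_feats_*` are all hypothetical choices, and random vectors stand in for features extracted from simulated and experimental spectra.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix with entries exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd_squared(source_feats, target_feats, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature batches.

    MMD^2 = mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)
    """
    k_ss = rbf_kernel(source_feats, source_feats, bandwidth)
    k_tt = rbf_kernel(target_feats, target_feats, bandwidth)
    k_st = rbf_kernel(source_feats, target_feats, bandwidth)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


# Stand-in features (assumption): in practice these would be the network's
# intermediate representations of simulated and unlabeled experimental spectra.
rng = np.random.default_rng(0)
sim_feats = rng.normal(0.0, 1.0, size=(256, 16))          # source domain
exp_feats_shifted = rng.normal(0.5, 1.0, size=(256, 16))  # target, before alignment
exp_feats_aligned = rng.normal(0.0, 1.0, size=(256, 16))  # target, after alignment

mmd_shifted = mmd_squared(sim_feats, exp_feats_shifted, bandwidth=4.0)
mmd_aligned = mmd_squared(sim_feats, exp_feats_aligned, bandwidth=4.0)
print(f"MMD^2 (shifted target): {mmd_shifted:.4f}")
print(f"MMD^2 (aligned target): {mmd_aligned:.4f}")
```

In a UDA training loop, a term of this kind would typically be added, with a weighting factor, to the supervised classification loss computed on the labeled synthetic data. No target-domain labels are needed to evaluate it, which is what allows the unlabeled target data to contribute to training.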