Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

Advances in the Laser Interferometer Gravitational-Wave Observatory (LIGO) have significantly improved the feasibility and reliability of gravitational wave detection. However, LIGO's high sensitivity makes it susceptible to transient noise artifacts known as glitches, which must be reliably distinguished from real gravitational wave signals. Existing approaches predominantly use fully supervised or semi-supervised algorithms for glitch classification and clustering. For the future task of identifying and classifying glitches across the main and auxiliary channels, building a dataset with manually labeled ground truth is impractical. In addition, glitch patterns can vary over time, producing new glitch types for which no manual labels exist. To address this challenge, we introduce the Cross-Temporal Spectrogram Autoencoder (CTSAE), an unsupervised method for dimensionality reduction and clustering of gravitational wave glitches. CTSAE integrates a novel four-branch autoencoder with a hybrid of Convolutional Neural Networks (CNN) and Vision Transformers (ViT). To further extract features across branches, we introduce a novel multi-branch fusion method using the CLS (class) token. Our model, trained and evaluated on the main-channel data of the GravitySpy O3 dataset, outperforms state-of-the-art semi-supervised learning methods on clustering tasks. To the best of our knowledge, CTSAE is the first unsupervised approach tailored specifically to clustering LIGO data, marking a significant step forward in gravitational wave research. The code for this paper is available at https://github.com/Zod-L/CTSAE
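
The abstract names CLS-token fusion across the four branches but does not spell out its mechanics. As a rough illustration only, the NumPy sketch below shows one plausible reading: each branch contributes a single CLS token, the four tokens exchange information through single-head self-attention, and the fused tokens are concatenated into one low-dimensional embedding for clustering. The function name `fuse_cls_tokens`, the token width of 16, the random weights, and the residual connection are all assumptions made for this example; they are not taken from the CTSAE code.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_cls_tokens(cls_tokens, w_q, w_k, w_v):
    """Hypothetical fusion step: single-head self-attention over CLS tokens.

    cls_tokens: array of shape (num_branches, dim), one CLS token per branch.
    Returns the fused tokens (num_branches, dim) and the attention
    weights (num_branches, num_branches).
    """
    q = cls_tokens @ w_q
    k = cls_tokens @ w_k
    v = cls_tokens @ w_v
    # Scaled dot-product attention: every branch attends to every branch.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    attn = softmax(scores, axis=-1)
    # Residual connection (an assumption) keeps each branch's own features.
    fused = cls_tokens + attn @ v
    return fused, attn


rng = np.random.default_rng(0)
num_branches, dim = 4, 16  # four branches as in the abstract; dim is illustrative

# Stand-ins for the CLS tokens that the four encoder branches would produce.
cls_tokens = rng.normal(size=(num_branches, dim))
# Random projection matrices stand in for learned weights.
w_q, w_k, w_v = (rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(3))

fused, attn = fuse_cls_tokens(cls_tokens, w_q, w_k, w_v)
embedding = fused.reshape(-1)  # concatenated latent vector, length 4 * 16
```

In an unsupervised pipeline of the kind described, a vector like `embedding` would be computed per glitch and passed to a clustering algorithm; in the actual model the projection weights would be learned through the autoencoder's reconstruction objective rather than sampled at random.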
