Galaxy morphology analysis involves classifying galaxies by their shapes and structures. For this task, directly training domain-specific models on large, annotated astronomical datasets is effective but costly. In contrast, fine-tuning vision foundation models on a smaller set of astronomical images is more resource-efficient but generally results in lower accuracy. To harness the benefits of both approaches and address their shortcomings, we propose GalaxAlign, a novel method that fine-tunes pre-trained foundation models to achieve high accuracy on astronomical tasks. Specifically, our method extends a contrastive learning architecture to align three types of data during fine-tuning: (1) a set of schematic symbols representing galaxy shapes and structures, (2) textual labels of these symbols, and (3) galaxy images. In this way, GalaxAlign not only eliminates the need for expensive pre-training but also enhances the effectiveness of fine-tuning. Extensive experiments on galaxy classification and similarity search demonstrate that our method effectively fine-tunes general pre-trained models for astronomical tasks by incorporating domain-specific multi-modal knowledge.
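To make the three-way alignment concrete, the following is a minimal sketch of one way such an objective could be written; it is not the GalaxAlign implementation. It assumes a CLIP-style symmetric InfoNCE loss averaged equally over the three modality pairs (image and symbol, image and text, symbol and text). The function names, the equal pair weighting, and the temperature of 0.07 are illustrative assumptions that the abstract does not specify, and the toy one-hot vectors stand in for encoder outputs.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, targets, temperature=0.07):
    """One-directional InfoNCE: anchors[i] should match targets[i]."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [sum(x * y for x, y in zip(a, t)) / temperature for t in targets]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]  # negative log-probability of the true match
    return total / len(anchors)


def symmetric_loss(a, b, temperature=0.07):
    """Average the a-to-b and b-to-a directions, as in CLIP-style training."""
    return 0.5 * (info_nce(a, b, temperature) + info_nce(b, a, temperature))


def tri_modal_loss(image_emb, symbol_emb, text_emb, temperature=0.07):
    """Assumed objective: mean symmetric loss over the three modality pairs."""
    img = [l2_normalize(v) for v in image_emb]
    sym = [l2_normalize(v) for v in symbol_emb]
    txt = [l2_normalize(v) for v in text_emb]
    pairs = [(img, sym), (img, txt), (sym, txt)]
    return sum(symmetric_loss(a, b, temperature) for a, b in pairs) / len(pairs)


# Toy batch of three galaxy classes; row i of each list belongs to the same class.
images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
symbols = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
texts = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
shuffled_texts = texts[1:] + texts[:1]  # text labels paired with the wrong class

aligned = tri_modal_loss(images, symbols, texts)
misaligned = tri_modal_loss(images, symbols, shuffled_texts)
print(f"aligned loss:    {aligned:.6f}")
print(f"misaligned loss: {misaligned:.6f}")
```

In this toy batch the loss is close to zero when all three modalities agree and grows sharply when the text labels are paired with the wrong class, which is the signal a fine-tuning step would minimize.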